Lecture Notes in Computer Science, Volume 3851: Computer Vision – ACCV 2006


  • Lecture Notes in Computer Science 3851
    Commenced Publication in 1973
    Founding and Former Series Editors: Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen

    Editorial Board

    David Hutchison, Lancaster University, UK

    Takeo Kanade, Carnegie Mellon University, Pittsburgh, PA, USA

    Josef Kittler, University of Surrey, Guildford, UK

    Jon M. Kleinberg, Cornell University, Ithaca, NY, USA

    Friedemann Mattern, ETH Zurich, Switzerland

    John C. Mitchell, Stanford University, CA, USA

    Moni Naor, Weizmann Institute of Science, Rehovot, Israel

    Oscar Nierstrasz, University of Bern, Switzerland

    C. Pandu Rangan, Indian Institute of Technology, Madras, India

    Bernhard Steffen, University of Dortmund, Germany

    Madhu Sudan, Massachusetts Institute of Technology, MA, USA

    Demetri Terzopoulos, New York University, NY, USA

    Doug Tygar, University of California, Berkeley, CA, USA

    Moshe Y. Vardi, Rice University, Houston, TX, USA

    Gerhard Weikum, Max-Planck Institute of Computer Science, Saarbruecken, Germany

  • P.J. Narayanan, Shree K. Nayar, Heung-Yeung Shum (Eds.)

    Computer Vision – ACCV 2006

    7th Asian Conference on Computer Vision
    Hyderabad, India, January 13-16, 2006
    Proceedings, Part I


  • Volume Editors

    P.J. Narayanan
    Centre for Visual Information Technology, International Institute of Information Technology, Gachibowli, Hyderabad 500032, India
    E-mail: [email protected]

    Shree K. Nayar
    Columbia University, Department of Computer Science, 530 West 120th Street, New York, NY 10027, USA
    E-mail: [email protected]

    Heung-Yeung Shum
    Microsoft Research Asia, 49 Zhichun Road, Beijing 100080, China
    E-mail: [email protected]

    Library of Congress Control Number: 2005938106

    CR Subject Classification (1998): I.4, I.5, I.2.10, I.2.6, I.3.5, F.2.2

    ISSN 0302-9743
    ISBN-10 3-540-31219-6 Springer Berlin Heidelberg New York
    ISBN-13 978-3-540-31219-2 Springer Berlin Heidelberg New York

    This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law.

    Springer is a part of Springer Science+Business Media

    springer.com

    © Springer-Verlag Berlin Heidelberg 2006
    Printed in Germany

    Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India
    Printed on acid-free paper    SPIN: 11612032    06/3142    5 4 3 2 1 0

  • Preface

    Welcome to the 7th Asian Conference on Computer Vision. It gives us great pleasure to bring forth its proceedings. ACCV has been making its rounds through the Asian landscape and came to India this year. We are proud of the technical program we have put together and we hope you enjoy it.

    Interest in computer vision is increasing and ACCV 2006 attracted about 500 submissions. The evaluation team consisted of 27 experts serving as Area Chairs and about 270 reviewers in all. The whole process was conducted electronically in a double-blind manner, a first for ACCV. Each paper was assigned to an Area Chair, who found three competent reviewers for it. We were able to contain the maximum load on the reviewers to nine papers and the average load to less than six. The review form had space for qualitative and quantitative evaluation of the paper on nine aspects. The submitted reviews underwent an elaborate process. First, they were seen by the Area Chair, who resolved divergences of opinion among reviewers, if any. The Area Chair then wrote qualitative comments and a quantitative score along with his/her initial recommendation on the paper. These were looked at by the Program Co-chairs and compiled into a probables list. The Area Chairs and Program Co-chairs met in Beijing during ICCV to discuss this list and arrived at the final list of 64 oral papers and 128 posters. Naturally, many deserving papers could not be accommodated.

    Katsushi Ikeuchi has been unflinching in his support of ACCV as a whole and ACCV 2006 in particular. His help was critical at many stages. We must thank the Area Chairs and the reviewers for their time and effort towards the conference. From IIIT Hyderabad, C.V. Jawahar and Anoop M. Namboodiri contributed in many ways to the program. The enthusiastic team of students from the Centre for Visual Information Technology (CVIT) was fully behind it. Karteek Alahari, Kiran Babu Varanasi, Sumeet Gupta, Sukesh Kumar, and Satyanarayana made all the logistics of the CFP, paper submission, review process, and preparation of the proceedings possible. The International Institute of Information Technology was fully behind the conference as a team and deserves our deep gratitude. Finally, but most importantly, we wish to thank the authors, who showed great enthusiasm for ACCV.

    ACCV has been gaining in stature as a platform to showcase the best of computer vision research over the years. We hope the 2006 edition has brought it forward at least a little. Computer vision continues to be an exciting area, and conferences like these provide the much-needed light to many who will embark on a journey down its path.

    P.J. Narayanan, Shree Nayar, Harry Shum

    (Program Chairs)

  • Conference Committees

    General Chairs
    Narendra Ahuja, University of Illinois & IIIT Hyderabad
    Takeo Kanade, Carnegie Mellon University
    Tieniu Tan, Chinese Academy of Sciences

    Program Chairs
    P.J. Narayanan, IIIT, Hyderabad
    Shree Nayar, Columbia University
    Harry Shum, Microsoft Research Asia

    Organizing Chairs
    C.V. Jawahar, IIIT, Hyderabad
    Santanu Chaudhury, IIT, Delhi

    Advisory Committee
    Masahiko Yachida, Osaka University
    Eam Khwang Teoh, Nanyang Technological University
    Roland Chin, Hong Kong University of Science and Technology
    Wen-Hsiang Tsai, Chiao Tung University
    David Suter, Monash University
    Sang-Uk Lee, Seoul National University
    Katsushi Ikeuchi, Tokyo University
    B. L. Deekshatulu, University of Hyderabad
    D. Dutta Majumdar, Indian Statistical Institute
    B. N. Chatterjee, Indian Institute of Technology, Kharagpur

  • Area Chairs

    Yaron Caspi, Hebrew University
    Tat Jen Cham, Nanyang Technological University
    Bhabatosh Chanda, Indian Statistical Institute
    Subhasis Chaudhuri, Indian Institute of Technology, Mumbai
    Yi-ping Hung, National Taiwan University
    Prem Kalra, Indian Institute of Technology, Delhi
    Chandra Kambhamettu, University of Delaware
    Mohan Kankanahalli, National University of Singapore
    In So Kweon, Korean Advanced Institute of Science and Technology
    Sang Wook Lee, Sogang University
    Ravikanth Malladi, GE John Welch Technology Centre
    Hiroshi Murase, Nagoya University
    Tomáš Pajdla, Czech Technical University
    Long Quan, Hong Kong University of Science and Technology
    A.N. Rajagopalan, Indian Institute of Technology, Madras
    Mubarak Shah, University of Central Florida
    Takeshi Shakunaga, Okayama University
    David Suter, Monash University
    Tanveer Syeda-Mahmood, IBM Almaden Research Center
    Chi-Keung Tang, Hong Kong University of Science and Technology
    Xiaoou Tang, Microsoft Research Asia
    Rin-ichiro Taniguchi, Kyushu University
    Baba Vemuri, University of Florida
    Yaser Yacoob, University of Maryland
    Naokazu Yokoya, Nara Institute of Science and Technology
    Changshui Zhang, Tsinghua University
    Zhengyou Zhang, Microsoft Research, Redmond

  • Reviewers

    Neeharika Adabala, Manoj Aggarwal, Amir Akbarzadeh, Yusuf Akgul, Kenichi Arakawa, Greg Arnold, Naoki Asada, Mark Ashdown, Tarkan Aydin, Noboru Babaguchi, Simon Baker, Hynek Bakstein, Alok Bandekar, Subhashis Banerjee, Musodiq Bello, Kiran Bhat, Rahul Bhotika, Prabir Kumar Biswas, Michael Brown, Sema Candemir, Lekha Chaisorn, Kap Luk Chan, Michael Chan, Sharat Chandran, Peng Chang, Parag Chaudhuri, Datong Chen, Chu-Song Chen, Xilin Chen, Yong-Sheng Chen, James Cheong, Tat-Jun Chin, Ondrej Chum, Antonio Criminisi, Shengyang Dai,

    Kristin Dana, James Davis, Amadou Diallo, Gianfranco Doretto, Lingyu Duan, Sumantra Dutta Roy, Ryan Eckbo, Alexei Efros, Hazim Kemal Ekenel, Sabu Emmanuel, Chris Engels, Mark Everingham, Zhimin Fan, Jan-Michael Frahm, Kazuhiro Fukui, Hui Gao, Theo Gevers, Christopher Geyer, Joshua Gluckman, Dmitry Goldgof, Girish Gopalakrishnan, Ralph Gross, Yanlin Guo, Keiji Gyohten, Mei Han, Wang Hanzi, Manabu Hashimoto, Jean-Yves Herve, Shinsaku Hiura, Jeffrey Ho, Ki-Sang Hong, Anthony Hoogs, Osamu Hori, Kazuhiro Hotta, Changbo Hu,

    Gang Hua, Rui Huang, Szu-Hao Huang, Daniel Huber, Sei Ikeda, Ali Iskurt, C.V. Jawahar, Jiaya Jia, Seon Joo Kim, Ioannis Kakadiaris, Atul Kanaujia, Masayuki Kanbara, Moon Gi Kang, Sing Bing Kang, Mark Keck, Zia Khan, Ron Kimmel, Koichi Kise, Dan Kong, Ravi Kothari, Ryo Kurazume, Uday Kurkure, James Kwok, Shang-Hong Lai, Arvind Lakshmikumar, Shihong Lao, Kyoung Mu Lee, Wee Kheng Leow, Maylor Leung, Thomas Leung, Dahua Li, Liyuan Li, Min Li, Lin Liang, Chia-Te Liao,


    Jenn-Jier James Lien, Joo-Hwee Lim, Stephen Lin, Che-Bin Liu, Zhiheng Liu, Qingshan Liu, Tyng-Luh Liu, Xiaoming Liu, Zicheng Liu, Yogish Mallya, Jose Marroquin, Daniel Martinec, Bogdan Matei, Yasuyuki Matsushita, Scott McClosskey, Paulo Mendonca, Shabbir Merchant, Branislav Micusik, Karol Mikula, James Miller, Anurag Mittal, Daisuke Miyazaki, Kooksang Moon, Yasuhiro Mukaigawa, Dipti Prasad Mukherjee, Jayanta Mukhopadhyay, Kartik Chandra Muktinutalapati, Rakesh Mullick, Christopher Nafis, Anoop Namboodiri, Srinivasa Narasimhan, Ko Nishino, David Nister, Naoko Nitta,

    Takahiro Okabe, Shinichiro Omachi, Sean O'Maley, Taragay Oskiper, Jiazhi Ou, Dirk Padfield, Kannappan Palaniappan, Vladimir Pavlovic, Shmuel Peleg, A.G. Amitha Perera, Michael Phelps, Carlos Phillips, Marc Pollefeys, Daniel Pooley, Arun Pujari, Kokku Raghu, Deepu Rajan, Subrata Rakshit, Srikumar Ramalingam, Ravi Ramamoorthi, Visvanathan Ramesh, Anand Rangarajan, Sohan Ranjan, Cen Rao, Christopher Rasmussen, Alex Rav-Acha, Sai Ravela, Jens Rittscher, James Ross, Amit Roy-Chowdhury, Hideo Saito, Subhajit Sanyal, Alessandro Sarti, Imari Sato, Tetsu Sato,

    Tomokazu Sato, Yoichi Sato, Peter Savadjiev, Konrad Schindler, Andrew Senior, Erdogan Sevilgen, Shiguang Shan, Ying Shan, Vinay Sharma, Zhang Sheng, Sheng-Wen Shih, Ikuko Shimizu Okatani, K.S. Shriram, Kaleem Siddiqi, Terence Sim, Sudipta Sinha, Jayanthi Sivaswamy, Thitiwan Srinark, S.H. Srinivasan, Christopher Stauffer, Jesse Stewart, Henrik Stewenius, Svetlana Stolpner, Peter Sturm, Akihiro Sugimoto, Rahul Sukthankar, Qibin Sun, Srikanth Suryanarayanan, Bharath Kumar SV, Rahul Swaminathan, Gokul Swamy, Kar-Han Tan, Ming Tang, Hai Tao,


    SriRam Thirthala, Ying-Li Tian, Prithi Tissainayagam, George Toderici, Shoji Tominaga, Wai Shun Dickson Tong, Philip Torr, Lorenzo Torresani, Emin Turanalp, Ambrish Tyagi, Seiichi Uchida, Norimichi Ukita, Anton van den Hengel, Rajashekar Venkatachalam, Svetha Venkatesh, Ulas Vural, Toshikazu Wada, Meng Wan, Huan Wang, Liang Wang, Shu-Fan Wang, Chieh-Chih (Bob) Wang, Zhizhou Wang, Tomas Werner, Frederick Wheeler, Kwan-Yee Kenneth Wong, Woontack Woo, Wen Wu, Yihong Wu, Ying Wu, Jing Xiao, Jiangjian Xiao, Wei Xu,

    Yasushi Yagi, Shuntaro Yamazaki, Kazumasa Yamazawa, Shuicheng Yan, Hua Yang, Ming Yang, Changjiang Yang, Jie Yang, Ming-Hsuan Yang, Ruigang Yang, Qingxiong Yang, Jieping Ye, Dit-Yan Yeung, Ting Yu, Xinguo Yu, Jingyi Yu, Ali Zandifar, Xiang Zhang, Hongming Zhang, Li Zhang, Tao Zhao, Wenyi Zhao, Jiang Yu Zheng, Wei Zhou, Yongwei Zhu, Andrew Zisserman, Larry Zitnick

  • Table of Contents Part I

    Camera Calibration

    On Using Silhouettes for Camera Calibration
    Edmond Boyer . . . 1

    Towards a Guaranteed Solution to Plane-Based Self-calibration
    Benoît Bocquillon, Pierre Gurdjos, Alain Crouzil . . . 11

    Plane-Based Calibration and Auto-calibration of a Fish-Eye Camera
    Hongdong Li, Richard Hartley . . . 21

    Stereo and Pose

    Stereo Matching Using Iterated Graph Cuts and Mean Shift Filtering
    Ju Yong Chang, Kyoung Mu Lee, Sang Uk Lee . . . 31

    Augmented Stereo Panoramas
    Chien-Wei Chen, Li-Wei Chan, Yu-Pao Tsai, Yi-Ping Hung . . . 41

    A Local Basis Representation for Estimating Human Pose from Cluttered Images
    Ankur Agarwal, Bill Triggs . . . 50

    Alignment of 3D Models to Images Using Region-Based Mutual Information and Neighborhood Extended Gaussian Images
    Hon-Keat Pong, Tat-Jen Cham . . . 60

    Texture

    The Eigen-Transform and Applications
    Alireza Tavakoli Targhi, Eric Hayman, Jan-Olof Eklundh, Mehrdad Shahshahani . . . 70

    Edge-Model Based Representation of Laplacian Subbands
    Malay K. Nema, Subrata Rakshit . . . 80

    Fusion of Texture Variation and On-Line Color Sampling for Moving Object Detection Under Varying Chromatic Illumination
    Chunfeng Shen, Xueyin Lin, Yuanchun Shi . . . 90


    Combining Microscopic and Macroscopic Information for Rotation and Histogram Equalization Invariant Texture Classification
    S. Liao, W.K. Law, Albert C.S. Chung . . . 100

    Poster Session 1

    Gaussian Decomposition for Robust Face Recognition
    Fumihiko Sakaue, Takeshi Shakunaga . . . 110

    Occlusion Invariant Face Recognition Using Selective LNMF Basis Images
    Hyun Jun Oh, Kyoung Mu Lee, Sang Uk Lee, Chung-Hyuk Yim . . . 120

    Two-Dimensional Fisher Discriminant Analysis and Its Application to Face Recognition
    Zhizheng Liang, Pengfei Shi, David Zhang . . . 130

    Combining Geometric and Gabor Features for Face Recognition
    P.S. Hiremath, Ajit Danti . . . 140

    Complex Activity Representation and Recognition by Extended Stochastic Grammar
    Zhang Zhang, Kaiqi Huang, Tieniu Tan . . . 150

    Recognize Multi-people Interaction Activity by PCA-HMMs
    Ying Wang, Xinwen Hou, Tieniu Tan . . . 160

    Object Recognition Through the Principal Component Analysis of Spatial Relationship Amongst Lines
    B.H. Shekar, D.S. Guru, P. Nagabhushan . . . 170

    Shift-Invariant Image Denoising Using Mixture of Laplace Distributions in Wavelet-Domain
    B.S. Raghavendra, P. Subbanna Bhat . . . 180

    Blind Watermarking Via Pixel Modification with Regular Rule
    Yulin Wang, Jinxu Guo . . . 189

    Surface Interpolation by Adaptive Neuro-fuzzy Inference System Based Local Ordinary Kriging
    Coskun Ozkan . . . 196

    PCA-Based Recognition for Efficient Inpainting
    Thommen Korah, Christopher Rasmussen . . . 206


    Texture Image Segmentation: An Interactive Framework Based on Adaptive Features and Transductive Learning
    Shiming Xiang, Feiping Nie, Changshui Zhang . . . 216

    Image Segmentation That Merges Together Boundary and Region Information
    Wei Wang, Ronald Chung . . . 226

    Extraction of Main Urban Roads from High Resolution Satellite Images by Machine Learning
    Yanqing Wang, Yuan Tian, Xianqing Tai, Lixia Shu . . . 236

    Texture Classification Using a Novel, Soft-Set Theory Based Classification Algorithm
    Milind M. Mushrif, S. Sengupta, A.K. Ray . . . 246

    Learning Multi-category Classification in Bayesian Framework
    Atul Kanaujia, Dimitris Metaxas . . . 255

    Estimation of Structural Information Content in Images
    Subrata Rakshit, Anima Mishra . . . 265

    Automatic Moving Object Segmentation with Accurate Boundaries
    Jia Wang, Haifeng Wang, Qingshan Liu, Hanqing Lu . . . 276

    A Bottom up Algebraic Approach to Motion Segmentation
    Dheeraj Singaraju, René Vidal . . . 286

    A Multiscale Co-linearity Statistic Based Approach to Robust Background Modeling
    Prithwijit Guha, Dibyendu Palai, K.S. Venkatesh, Amitabha Mukerjee . . . 297

    Motion Detection in Driving Environment Using U-V-Disparity
    Jia Wang, Zhencheng Hu, Hanqing Lu, Keiichi Uchimura . . . 307

    Visual Surveillance Using Less ROIs of Multiple Non-calibrated Cameras
    Takashi Nishizaki, Yoshinari Kameda, Yuichi Ohta . . . 317

    A Novel Robust Statistical Method for Background Initialization and Visual Surveillance
    Hanzi Wang, David Suter . . . 328

    Exemplar-Based Human Contour Tracking
    Shiming Xiang, Feiping Nie, Changshui Zhang . . . 338


    Tracking Targets Via Particle Based Belief Propagation
    Jianru Xue, Nanning Zheng, Xiaopin Zhong . . . 348

    Multiple-Person Tracking Using a Plan-View Map with Error Estimation
    Kentaro Hayashi, Takahide Hirai, Kazuhiko Sumi, Koichi Sasakawa . . . 359

    Extrinsic Camera Parameter Estimation Based-on Feature Tracking and GPS Data
    Yuji Yokochi, Sei Ikeda, Tomokazu Sato, Naokazu Yokoya . . . 369

    A Method for Calibrating a Motorized Object Rig
    Pang-Hung Huang, Yu-Pao Tsai, Wan-Yen Lo, Sheng-Wen Shih, Chu-Song Chen, Yi-Ping Hung . . . 379

    Calibration of Rotating Line Camera for Spherical Imaging
    Tomoyuki Hirota, Hajime Nagahara, Masahiko Yachida . . . 389

    Viewpoint Determination of Image by Interpolation over Sparse Samples
    Bodong Liang, Ronald Chung . . . 399

    Inverse Volume Rendering Approach to 3D Reconstruction from Multiple Images
    Shuntaro Yamazaki, Masaaki Mochimaru, Takeo Kanade . . . 409

    Gaze Direction Estimation with a Single Camera Based on Four Reference Points and Three Calibration Images
    Shinjiro Kawato, Akira Utsumi, Shinji Abe . . . 419

    3D Shape Recovery of Smooth Surfaces: Dropping the Fixed Viewpoint Assumption
    Yael Moses, Ilan Shimshoni . . . 429

    Stereo Matching by Interpolation
    Bodong Liang, Ronald Chung . . . 439

    Novel View Synthesis Using Locally Adaptive Depth Regularization
    Hitesh Shah, Subhasis Chaudhuri . . . 449

    View Synthesis of Scenes with Multiple Independently Translating Objects from Uncalibrated Views
    Geetika Sharma, Santanu Chaudhury, J.B. Srivastava . . . 460

    Generating Free Viewpoint Images from Mutual Projection of Cameras
    Koichi Kato, Jun Sato . . . 470


    Video Synthesis with High Spatio-temporal Resolution Using Motion Compensation and Image Fusion in Wavelet Domain
    Kiyotaka Watanabe, Yoshio Iwai, Hajime Nagahara, Masahiko Yachida, Toshiya Suzuki . . . 480

    Estimating Illumination Parameters in Real Space with Application to Image Relighting
    Feng Xie, Linmi Tao, Guangyou Xu, Huijun Di . . . 490

    An Efficient Real Time Low Bit Rate Video Codec
    Shikha Tripathi, R. Vikas, R.C. Jain . . . 500

    Employing a Fish-Eye for Scene Tunnel Scanning
    Jiang Yu Zheng, Shigang Li . . . 509

    Automatically Building 2D Statistical Shapes Using the Topology Preservation Model GNG
    José García Rodríguez, Anastassia Angelopoulou, Alexandra Psarrou, Kenneth Revett . . . 519

    Semi-metric Space: A New Approach to Treat Orthogonality and Parallelism
    Jun-Sik Kim, In So Kweon . . . 529

    Face Recognition

    Boosting Multi-gabor Subspaces for Face Recognition
    QingShan Liu, HongLiang Jin, XiaoOu Tang, HanQing Lu, SongDe Ma . . . 539

    A New Distance Criterion for Face Recognition Using Image Sets
    Tat-Jun Chin, David Suter . . . 549

    Face-Voice Authentication Based on 3D Face Models
    Girija Chetty, Michael Wagner . . . 559

    Face Recognition Under Varying Illumination Based on MAP Estimation Incorporating Correlation Between Surface Points
    Mihoko Shimano, Kenji Nagao, Takahiro Okabe, Imari Sato, Yoichi Sato . . . 569

    Exploring Facial Expression Effects in 3D Face Recognition Using Partial ICP
    Yueming Wang, Gang Pan, Zhaohui Wu, Yigang Wang . . . 581


    Vision Based Speech Animation Transferring with Underlying Anatomical Structure
    Yuru Pei, Hongbin Zha . . . 591

    Variational Methods

    A Level Set Approach for Shape Recovery of Open Contours
    Min Li, Chandra Kambhamettu, Maureen Stone . . . 601

    Statistical Shape Models Using Elastic-String Representations
    Anuj Srivastava, Aastha Jain, Shantanu Joshi, David Kaziska . . . 612

    Minimal Weighted Local Variance as Edge Detector for Active Contour Models
    W.K. Law, Albert C.S. Chung . . . 622

    A New Active Contour Model: Curvature Gradient Vector Flow
    Jifeng Ning, Chengke Wu, Shigang Liu, Peizhi Wen . . . 633

    Dynamic Open Contours Using Particle Swarm Optimization with Application to Fluid Interface Extraction
    M. Thomas, S.K. Misra, C. Kambhamettu, J.T. Kirby . . . 643

    Attractor-Guided Particle Filtering for Lip Contour Tracking
    Yong-Dian Jian, Wen-Yan Chang, Chu-Song Chen . . . 653

    Tracking

    Tracking with the Kinematics of Extremal Contours
    David Knossow, Rémi Ronfard, Radu Horaud, Frédéric Devernay . . . 664

    Multiregion Level Set Tracking with Transformation Invariant Shape Priors
    Michael Fussenegger, Rachid Deriche, Axel Pinz . . . 674

    Multi-view Object Tracking Using Sequential Belief Propagation
    Wei Du, Justus Piater . . . 684

    Online Updating Appearance Generative Mixture Model for Meanshift Tracking
    Jilin Tu, Hai Tao, Thomas Huang . . . 694


    Geometry and Calibration

    Theory and Calibration for Axial Cameras
    Srikumar Ramalingam, Peter Sturm, Suresh K. Lodha . . . 704

    Error Characteristics of SFM with Erroneous Focal Length
    Loong-Fah Cheong, Xu Xiang . . . 714

    Interpreting Sphere Images Using the Double-Contact Theorem
    Xianghua Ying, Hongbin Zha . . . 724

    New 3D Fourier Descriptors for Genus-Zero Mesh Objects
    Hongdong Li, Richard Hartley . . . 734

    Lighting and Focus

    High Dynamic Range Global Mosaic
    Dae-Woong Kim, Ki-Sang Hong . . . 744

    Image-Based Calibration of Spatial Domain Depth-from-Defocus and Application to Automatic Focus Tracking
    Soon-Yong Park, Jaekyoung Moon . . . 754

    Effects of Image Segmentation for Approximating Object Appearance Under Near Lighting
    Takahiro Okabe, Yoichi Sato . . . 764

    Fast Feature Extraction Using Approximations to Derivatives with Summed-Area Images
    Paul Wyatt, Hiroaki Nakai . . . 776

    Poster Session 2

    Detecting Faces from Low-Resolution Images
    Shinji Hayashi, Osamu Hasegawa . . . 787

    Human Distribution Estimation Using Shape Projection Model Based on Multiple-Viewpoint Observations
    Akira Utsumi, Hirotake Yamazoe, Ken-ichi Hosaka, Seiji Igi . . . 797

    Modelling the Effect of View Angle Variation on Appearance-Based Gait Recognition
    Shiqi Yu, Daoliang Tan, Tieniu Tan . . . 807


    Gesture Recognition Using Quadratic Curves
    Qiulei Dong, Yihong Wu, Zhanyi Hu . . . 817

    From Motion Patterns to Visual Concepts for Event Analysis in Dynamic Scenes
    Lun Xin, Tieniu Tan . . . 826

    Probabilistic Modeling for Structural Change Inference
    Wei Liu, Véronique Prinet . . . 836

    Robust Occluded Shape Recognition
    Ronak Shah, Anima Mishra, Subrata Rakshit . . . 847

    Interactive Contour Extraction Using NURBS-HMM
    Debin Lei, Chunhong Pan, Qing Yang, Minyong Shi . . . 858

    Learning Parameter Tuning for Object Extraction
    Xiongcai Cai, Arcot Sowmya, John Trinder . . . 868

    Region-Level Motion-Based Foreground Detection with Shadow Removal Using MRFs
    Shih-Shinh Huang, Li-Chen Fu, Pei-Yung Hsiao . . . 878

    Waterfall Segmentation of Complex Scenes
    Allan Hanbury, Beatriz Marcotegui . . . 888

    Markovian Framework for Foreground-Background-Shadow Separation of Real World Video Scenes
    Csaba Benedek, Tamás Szirányi . . . 898

    Separation of Reflection and Transparency Using Epipolar Plane Image Analysis
    Thanda Oo, Hiroshi Kawasaki, Yutaka Ohsawa, Katsushi Ikeuchi . . . 908

    Fast Approximated SIFT
    Michael Grabner, Helmut Grabner, Horst Bischof . . . 918

    Image Matching by Multiscale Oriented Corner Correlation
    Feng Zhao, Qingming Huang, Wen Gao . . . 928

    Surface Registration Using Extended Polar Maps
    Elsayed E. Hemayed . . . 938

    Multiple Range Image Registration by Matching Local Log-Polar Range Images
    Takeshi Masuda . . . 948


    Incremental Mesh-Based Integration of Registered Range Images: Robust to Registration Error and Scanning Noise
    Hong Zhou, Yonghuai Liu, Longzhuang Li . . . 958

    Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 969

  • Table of Contents Part II

    Infinite Homography Estimation Using Two Arbitrary Planar Rectangles
    Jun-Sik Kim, In So Kweon . . . 1

    Shape Orientability
    Jovisa Zunic, Paul L. Rosin, Lazar Kopanja . . . 11

    How to Compute the Pose of an Object Without a Direct View?
    Peter Sturm, Thomas Bonfort . . . 21

    Dense Motion and Disparity Estimation Via Loopy Belief Propagation
    Michael Isard, John MacCormick . . . 32

    A Real-Time Large Disparity Range Stereo-System Using FPGAs
    Divyang K. Masrani, W. James MacLean . . . 42

    Use of a Dense Surface Point Distribution Model in a Three-Stage Anatomical Shape Reconstruction from Sparse Information for Computer Assisted Orthopaedic Surgery: A Preliminary Study
    Guoyan Zheng, Kumar T. Rajamani, Lutz-Peter Nolte . . . 52

    Fisheye Lenses Calibration Using Straight-Line Spherical Perspective Projection Constraint
    Xianghua Ying, Zhanyi Hu, Hongbin Zha . . . 61

    Robust Linear Auto-calibration of a Moving Camera from Image Sequences
    Thorsten Thormählen, Hellward Broszio, Patrick Mikulastik . . . 71

    Frame Rate Stabilization by Variable Resolution Shape Reconstruction for On-Line Free-Viewpoint Video Generation
    Rui Nabeshima, Megumu Ueda, Daisaku Arita, Rin-ichiro Taniguchi . . . 81

    Vision-Based Posing of 3D Virtual Actors
    Ameya S. Vaidya, Appu Shaji, Sharat Chandran . . . 91

    Super-Resolved Video Mosaicing for Documents Based on Extrinsic Camera Parameter Estimation
    Akihiko Iketani, Tomokazu Sato, Sei Ikeda, Masayuki Kanbara, Noboru Nakajima, Naokazu Yokoya . . . 101


    Content Based Image and Video Retrieval Using Embedded Text
    Chinmaya Misra, Shamik Sural . . . 111

    Object Tracking Using Background Subtraction and Motion Estimation in MPEG Videos
    Ashwani Aggarwal, Susmit Biswas, Sandeep Singh, Shamik Sural, A.K. Majumdar . . . 121

    Multi-camera Tracking of Articulated Human Motion Using Motion and Shape Cues
    Aravind Sundaresan, Rama Chellappa . . . 131

    Matching Gait Image Sequences in the Frequency Domain for Tracking People at a Distance
    Ryusuke Sagawa, Yasushi Makihara, Tomio Echigo, Yasushi Yagi . . . 141

    Performance Evaluation of Object Detection and Tracking in Video
    Vasant Manohar, Padmanabhan Soundararajan, Harish Raju, Dmitry Goldgof, Rangachar Kasturi, John Garofolo . . . 151

    Vehicle Detection Using Double Slit Camera
    Shunji Katahara, Masayoshi Aoki . . . 162

    Automatic Vehicle Detection Using Statistical Approach
    Chi-Chen Raxle Wang, Jenn-Jier James Lien . . . 171

    A Handheld Projector Supported by Computer Vision
    Akash Kushal, Jeroen van Baar, Ramesh Raskar, Paul Beardsley . . . 183

    FormPad: A Camera-Assisted Digital Notepad
    Tanveer Syeda-Mahmood, Thomas Zimmerman . . . 193

    Symmetric Color Ratio in Spiral Architecture
    Wenjing Jia, Huaifeng Zhang, Xiangjian He, Qiang Wu . . . 204

    A Geometric Contour Framework with Vector Field Support
    Zhenglong Li, Qingshan Liu, Hanqing Lu . . . 214

    Clustering Spherical Shells by a Mini-Max Information Algorithm
    Xulei Yang, Qing Song, Wenbo Zhang, Zhimin Wang . . . 224

    Clustering of Interval-Valued Symbolic Patterns Based on Mutual Similarity Value and the Concept of k-Mutual Nearest Neighborhood
    D.S. Guru, H.S. Nagendraswamy . . . 234


    Multiple Similarities Based Kernel Subspace Learning for Image Classification
    Wang Yan, Qingshan Liu, Hanqing Lu, Songde Ma . . . 244

    Detection and Applications

    Boosted Algorithms for Visual Object Detection on Graphics Processing Units
    Hicham Ghorayeb, Bruno Steux, Claude Laurgeau . . . 254

    Combining Iterative Inverse Filter with Shock Filter for Baggage Inspection Image Deblurring
    Guoqiang Yu, Jin Zhang, Li Zhang, Zhiqiang Chen, Yuanjing Li . . . 264

    Automatic Chromosome Classification Using Medial Axis Approximation and Band Profile Similarity
    Jau Hong Kao, Jen Hui Chuang, Tsai Pei Wang . . . 274

    Object Detection Using a Cascade of 3D Models
    Hon-Keat Pong, Tat-Jen Cham . . . 284

    Heuristic Pre-clustering Relevance Feedback in Region-Based Image Retrieval
    Wan-Ting Su, Wen-Sheng Chu, Jenn-Jier James Lien . . . 294

    Biologically Motivated Perceptual Feature: Generalized Robust Invariant Feature
    Sungho Kim, In So Kweon . . . 305

    Statistics and Kernels

    A Framework for 3D Object Recognition Using the Kernel Constrained Mutual Subspace Method
    Kazuhiro Fukui, Björn Stenger, Osamu Yamaguchi . . . 315

    An Iterative Method for Preserving Edges and Reducing Noise in High Resolution Image Reconstruction
    Chanho Jung, Gyeonghwan Kim . . . 325

    Fast Binary Dilation/Erosion Algorithm Using Kernel Subdivision
    Ajay Narayanan . . . 335


    Fast Global Motion Estimation Via Iterative Least-Square Method
    Jia Wang, Haifeng Wang, Qingshan Liu, Hanqing Lu . . . 343

    Kernel-Based Robust Tracking for Objects Undergoing Occlusion
    R. Venkatesh Babu, Patrick Pérez, Patrick Bouthemy . . . 353

    Adaptive Object Tracking with Online Statistical Model Update
    KaiYeuh Chang, Shang-Hong Lai . . . 363

    Segmentation

    Inducing Semantic Segmentation from an Example
    Yaar Schnitman, Yaron Caspi, Daniel Cohen-Or, Dani Lischinski . . . 373

    Super Resolution Using Graph-Cut
    Uma Mudenagudi, Ram Singla, Prem Kalra, Subhashis Banerjee . . . 385

    A Multiphase Level Set Based Segmentation Framework with Pose Invariant Shape Priors
    Michael Fussenegger, Rachid Deriche, Axel Pinz . . . 395

    A Unified Framework for Segmentation-Assisted Image Registration
    Jundong Liu, Yang Wang, Junhong Liu . . . 405

    Geometry and Statistics

    Fusion of 3D and Appearance Models for Fast Object Detection and Pose Estimation
    Hesam Najafi, Yakup Genc, Nassir Navab . . . 415

    Efficient 3D Face Reconstruction from a Single 2D Image by Combining Statistical and Geometrical Information
    Shu-Fan Wang, Shang-Hong Lai . . . 427

    Multiple View Geometry in the Space-Time
    Kazutaka Hayakawa, Jun Sato . . . 437

    Detecting Critical Configuration of Six Points
    Yihong Wu, Zhanyi Hu . . . 447

    Robustness in Motion Averaging
    Venu Madhav Govindu . . . 457


    Signal Processing

    Detection of Moving Objects by Independent Component Analysis
    Masaki Yamazaki, Gang Xu, Yen-Wei Chen . . . 467

    OK-Quantization Theory and Its Relationship to Sampling Theorem
    Yuji Tanaka, Takayuki Fujiwara, Hiroyasu Koshimizu, Taizo Iijima . . . 479

    Contour Matching Based on Belief Propagation
    Shiming Xiang, Feiping Nie, Changshui Zhang . . . 489

    Key Frame-Based Activity Representation Using Antieigenvalues
    Naresh P. Cuntoor, Rama Chellappa . . . 499

    Fast Image Replacement Using Multi-resolution Approach
    Chih-Wei Fang, Jenn-Jier James Lien . . . 509

    Poster Session 3

    Histogram Features-Based Fisher Linear Discriminant for Face Detection
    Haijing Wang, Peihua Li, Tianwen Zhang . . . 521

    Perception Based Lighting Balance for Face Detection
    Xiaoyue Jiang, Pei Sun, Rong Xiao, Rongchun Zhao . . . 531

    An Adaptive Weight Assignment Scheme in Linear Subspace Approaches for Face Recognition
    Satyanadh Gundimada, Vijayan Asari . . . 541

    Template-Based Hand Pose Recognition Using Multiple Cues
    Björn Stenger . . . 551

    Scalable Representation and Learning for 3D Object Recognition Using Shared Feature-Based View Clustering
    Sungho Kim, In So Kweon . . . 561

    Video Scene Interpretation Using Perceptual Prominence and Mise-en-scene Features
    Gaurav Harit, Santanu Chaudhury . . . 571

    Smooth Foreground-Background Segmentation for Video Processing
    Konrad Schindler, Hanzi Wang . . . 581


    Efficient Object Segmentation Using Digital Matting for MPEG Video Sequences
    Yao-Tsung Jason Tsai, Jenn-Jier James Lien . . . 591

    Background Segmentation Beyond RGB
    Fredrik Kristensen, Peter Nilsson, Viktor Öwall . . . 602

    Classification of Photometric Factors Based on Photometric Linearization
    Yasuhiro Mukaigawa, Yasunori Ishii, Takeshi Shakunaga . . . 613

    Material Classification Using Morphological Pattern Spectrum for Extracting Textural Features from Material Micrographs
    D. Ghosh, David C. Tou Wei . . . 623

    A Hierarchical Framework for Generic Sports Video Classification
    Maheshkumar H. Kolekar, Somnath Sengupta . . . 633

    Feature Detection with an Improved Anisotropic Filter
    Mohamed Gobara, David Suter . . . 643

    Feature Selection for Image Categorization
    Feng Xu, Yu-Jin Zhang . . . 653

    An Energy Minimization Process for Extracting Eye Feature Based on Deformable Template
    Huachun Tan, Yu-Jin Zhang . . . 663

    Image Feature Detection as Robust Model Fitting
    Dengfeng Chai, Qunsheng Peng . . . 673

    Extraction of Salient Contours Via Excitatory-Inhibitory Interactions in the Visual Cortex
    Qiling Tang, Nong Sang, Tianxu Zhang . . . 683

    Identification of Printing Process Using HSV Colour Space
    Haritha Dasari, Chakravarthy Bhagvati . . . 692

    Spatiotemporal Density Feature Analysis to Detect Liver Cancer from Abdominal CT Angiography
    Yoshito Mekada, Yuki Wakida, Yuichiro Hayashi, Ichiro Ide, Hiroshi Murase . . . 702

    Fast Block Matching Algorithm in Walsh Hadamard Domain
    Ngai Li, Chun-Man Mak, Wai-Kuen Cham . . . 712


    Skin Detection by Near Infrared Multi-band for Driver Support System
    Yasuhiro Suzuki, Kazuhiko Yamamoto, Kunihito Kato, Michinori Andoh, Shinichi Kojima . . . 722

    Extracting Surface Representations from Rim Curves
    Hai Chen, Kwan-Yee K. Wong, Chen Liang, Yue Chen . . . 732

    Applying Non-stationary Noise Estimation to Achieve Contrast Invariant Edge Detection
    Paul Wyatt, Hiroaki Nakai . . . 742

    Corner Detection Using Morphological Skeleton: An Efficient and Nonparametric Approach
    R. Dinesh, D.S. Guru . . . 752

    Correspondence Search in the Presence of Specular Highlights Using Specular-Free Two-Band Images
    Kuk-Jin Yoon, In-So Kweon . . . 761

    Stereo Matching Algorithm Using a Weighted Average of Costs Aggregated by Various Window Sizes
    Kanya Sasaki, Seiji Kameda, Atsushi Iwata . . . 771

    Pseudo Measurement Based Multiple Model Approach for Robust Player Tracking
    Xiaopin Zhong, Nanning Zheng, Jianru Xue . . . 781

    A Hierarchical Method for 3D Rigid Motion Estimation
    Thitiwan Srinark, Chandra Kambhamettu, Maureen Stone . . . 791

    Virtual Fashion Show Using Real-Time Markerless Motion Capture
    Ryuzo Okada, Björn Stenger, Tsukasa Ike, Nobuhiro Kondoh . . . 801

    Space-Time Invariants for 3D Motions from Projective Cameras
    Ying Piao, Jun Sato . . . 811

    Detecting and Tracking Distant Objects at Night Based on Human Visual System
    Kaiqi Huang, Liangsheng Wang, Tieniu Tan . . . 822

    Motion Guided Video Sequence Synchronization
    Daniel Wedge, Du Huynh, Peter Kovesi . . . 832

    Landmark Based Global Self-localization of Mobile Soccer Robots
    Abdul Bais, Robert Sablatnig . . . 842


    Self-calibration Based 3D Information Extraction and Application in Broadcast Soccer Video
    Yang Liu, Dawei Liang, Qingming Huang, Wen Gao . . . 852

    Error Analysis of SFM Under Weak-Perspective Projection
    Loong-Fah Cheong, Shimiao Li . . . 862

    General Specular Surface Triangulation
    Thomas Bonfort, Peter Sturm, Pau Gargallo . . . 872

    Dense 3D Reconstruction with an Uncalibrated Active Stereo System
    Hiroshi Kawasaki, Yutaka Ohsawa, Ryo Furukawa, Yasuaki Nakamura . . . 882

    Surface-Independent Direct-Projected Augmented Reality
    Hanhoon Park, Moon-Hyun Lee, Sang-Jun Kim, Jong-Il Park . . . 892

    Aspects of Optimal Viewpoint Selection and Viewpoint Fusion
    Frank Deinzer, Joachim Denzler, Christian Derichs, Heinrich Niemann . . . 902

    An Efficient Approach for Multi-view Face Animation Based on Quasi 3D Model
    Yanghua Liu, Guangyou Xu, Linmi Tao . . . 913

    Hallucinating 3D Faces
    Shiqi Peng, Gang Pan, Shi Han, Yueming Wang . . . 923

    High Quality Compression of Educational Videos Using Content-Adaptive Framework
    Ankush Mittal, Ankur Jain, Sourabh Jain, Sumit Gupta . . . 933

    Video Processing

    Double Regularized Bayesian Estimation for Blur Identification in Video Sequences
    Hongwei Zheng, Olaf Hellwich . . . 943

    A Multi-Layer MRF Model for Video Object Segmentation
    Zoltán Kató, Ting-Chuen Pong . . . 953


    Scene Interpretation: Unified Modeling of Visual Context by Particle-Based Belief Propagation in Hierarchical Graphical Model
    Sungho Kim, In So Kweon . . . 963

    Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 973

  • On Using Silhouettes for Camera Calibration

    Edmond Boyer

    MOVI - Gravir - INRIA Rhône-Alpes, Montbonnot, [email protected]

    Abstract. This paper addresses the problem of camera calibration using object silhouettes in image sequences. It is known that silhouettes encode information on camera parameters by the fact that their associated viewing cones should present a common intersection in space. In this paper, we investigate how to evaluate calibration parameters given a set of silhouettes, and how to optimize such parameters with silhouette cues only. The objective is to provide on-line tools for silhouette-based modeling applications in multiple camera environments. Our contributions with respect to existing works in this field are, first, to establish the exact constraint that camera parameters should satisfy with respect to silhouettes, and, second, to derive from this constraint new practical criteria to evaluate and to optimize camera parameters. Results on both synthetic and real data illustrate the interest of the proposed framework.

    1 Introduction

    Camera calibration is a necessary preliminary step for most computer vision applications involving geometric measures. This includes 3D modeling, localization and navigation, among other applications. Traditional solutions in computer vision are based on particular features that are extracted and matched, or identified, in images. This article studies solutions based on silhouettes, which do not require any particular patterns nor matching or identification procedures. They therefore represent a convenient solution to evaluate and improve a camera calibration on-line, without the help of any specific patterns. The practical interest arises more specifically in multiple camera environments, which are becoming common due, in part, to recent evolutions of camera acquisition materials. These environments require flexible solutions to estimate, and to frequently update, camera parameters, especially because calibrations often do not remain valid over time.

    In a seminal work on motion from silhouettes, Rieger [1] used fixed points on silhouette boundaries to estimate the axis of rotation from 2 orthographic images. These fixed points correspond to epipolar tangencies, where epipolar planes are tangent to the observed object's surface. Later on, these points were identified as frontier points in [2], since they go across the frontier of the visible region on a surface when the viewpoint is continuously changing. In the associated work, the constraint they give on camera motion was used to optimize essential matrices. In [3], this constraint was established as an extension of the traditional epipolar constraint, and thus was called the generalized epipolar constraint. Frontier points give constraints on camera motions; however, they must first be localized on silhouette boundaries. This operation appears to be difficult:

    P.J. Narayanan et al. (Eds.): ACCV 2006, LNCS 3851, pp. 1-10, 2006. © Springer-Verlag Berlin Heidelberg 2006


    in [4] inflexions of the silhouette boundary are used to detect frontier points from which motion is derived, in [5] infinite 4D spaces are explored using random samples, and in [6] contour signatures are used to find potential frontier points. All these approaches require frontier points to be identified on the silhouette contours prior to camera parameter estimation. However, such frontier points cannot be localized exactly without knowing epipoles. As a consequence, only approximated solutions are usually obtained by discrete sampling over a space of potential locations for frontier points or epipoles. We take a different strategy and bypass the frontier point localization by considering the problem globally over sets of silhouettes. The interest is to transform a computationally expensive discrete search into an exact, and much faster, optimization over a continuous space.

    It is also worth mentioning a particular class of shape-from-silhouette applications which use turntables and a single camera to compute 3D models. Such model acquisition systems have received noticeable attention from the vision community [7, 8, 9]. They are geometrically equivalent to a camera rotating in a plane around the scene. The specific constraints which result from this situation can be used to estimate all motion parameters. However, the associated solutions do not extend to general camera configurations as assumed in this paper.

    Our approach is based first on the study of the constraint that both silhouettes and camera parameters must satisfy. We then derive two criteria: a quantitative smooth criterion in the form of a distance, and a qualitative discrete criterion, both being defined at any point inside a silhouette. This provides practical tools to qualitatively evaluate calibrations, and to quantitatively optimize their parameters. It appears to be particularly useful in multiple camera environments where calibrations often change, and for which fast on-line solutions are required.

    This paper is organized as follows. Section 2 recalls background material. Section 3 makes precise the constraints and respective properties of silhouettes, viewing cones and frontier points. Section 4 introduces the distance between viewing cones that is used as a geometric criterion. Section 5 introduces the qualitative criterion. Section 6 shows results on various data before concluding in section 7.

    2 Definitions

    Silhouette: Suppose that a scene, containing an arbitrary number of objects, is observed by a set of pinhole cameras. Suppose also that projections of objects in the images are segmented and identified as foreground. O then denotes the set of observed objects and I_O the corresponding binary foreground-background images. The foreground region of an image i consists of the union of the objects' projections in that image and, hence, may be composed of several unconnected components with non-zero genus. Each connected component is called a silhouette, and their union in image i is denoted S_i.
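    To make the definition concrete, silhouettes can be recovered from a binary foreground image as its connected components. The following is a minimal illustrative sketch, not part of the paper; it assumes scipy is available and that foreground masks are boolean arrays:

```python
import numpy as np
from scipy import ndimage

def extract_silhouettes(foreground):
    """Split a binary foreground image of image i into its silhouettes.

    Each connected foreground component is one silhouette; their union
    over the image is the silhouette set S_i.
    """
    foreground = np.asarray(foreground, dtype=bool)
    labels, count = ndimage.label(foreground)           # label connected components
    return [labels == k for k in range(1, count + 1)]   # one boolean mask per silhouette
```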

    Viewing Cone: Consider the set of viewing rays associated with image points belonging to a single silhouette in S_i. The closure of this set defines a generalized cone in space, called a viewing cone. The viewing cone's delimiting surface is tangent to the surface of the corresponding foreground object. In the same way that S_i is possibly composed of unconnected components, the viewing cones of image i are possibly several distinct cones, one associated with each silhouette in S_i. Their union is denoted C_i. Note that individual objects are not distinguished here.

    Fig. 1. A visual hull and 2 of its viewing cones

    Visual Hull: The visual hull [10] is formally defined as the maximum surface consistent with all silhouettes in all images. Intuitively, it is the intersection of the viewing cones of all images (see figure 1). In practice, silhouettes are delimited by 2D polygonal curves, thus viewing cones are polyhedral cones and, since a finite set of images is considered, visual hulls are polyhedrons. Assume that all objects are seen from all image viewpoints; then:

    VH(I_O) = ⋂_{i∈I_O} C_i,   (1)

    is the visual hull associated with the set I_O of foreground images and their viewing cones {C_i}_{i∈I_O}. If all objects O do not project onto all images, then the reasoning that follows still applies to subsets of objects and subsets of cameras which satisfy the common visibility constraint.
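    Equation (1) maps directly onto the classical voxel-based approximation of the visual hull: a candidate 3D point is kept only if it projects inside the foreground of every image. A minimal sketch under assumed inputs (3x4 projection matrices and boolean silhouette masks; this is an illustration, not the paper's implementation):

```python
import numpy as np

def project(P, X):
    """Oriented projection of homogeneous 3D points X (N x 4) by a 3x4 camera P."""
    x = X @ P.T
    in_front = x[:, 2] > 1e-9                    # keep only points in front of the camera
    u = np.full(len(x), -1, dtype=int)
    v = np.full(len(x), -1, dtype=int)
    u[in_front] = np.round(x[in_front, 0] / x[in_front, 2]).astype(int)
    v[in_front] = np.round(x[in_front, 1] / x[in_front, 2]).astype(int)
    return u, v, in_front

def visual_hull(voxels, cameras, silhouettes):
    """Eq. (1): keep the voxels lying in the intersection of all viewing cones C_i."""
    X = np.hstack([voxels, np.ones((len(voxels), 1))])
    keep = np.ones(len(voxels), dtype=bool)
    for P, S in zip(cameras, silhouettes):
        u, v, ok = project(P, X)
        h, w = S.shape
        ok &= (u >= 0) & (u < w) & (v >= 0) & (v < h)
        ok[ok] &= S[v[ok], u[ok]]                # pixel must lie on the silhouette
        keep &= ok                               # intersection over all images
    return voxels[keep]
```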

    3 Geometric Consistency Constraint

    In this section, the exact and optimal geometric consistency which applies with silhouettes is first established, and its equivalence with more practical constraints is discussed.

    3.1 Visual Hull Constraint

    Calibration constraints are usually derived from geometric constraints reflecting geometric coherence. For instance, different image projections of the same feature should give rise to the same spatial location with true camera parameters. In the case of silhouettes, and under the assumption that no other image primitives are available, the only geometric coherence that applies comes from the fact that all viewing cones should correspond to the same objects with true camera parameters. Thus:

    O ⊆ VH(I_O),

    and consequently by projecting in any image i:

    S_i ⊆ P_i(VH(I_O)),   ∀ i ∈ I_O,


    where P_i(·) is the oriented projection¹ in image i. Thus, viewing cones should all intersect, and viewing rays belonging to viewing cones should all contribute to this intersection. The above expression is equivalent to:

    ⋃_{i∈I_O} [S_i − P_i(VH(I_O))] = ∅,   (2)

    which says that the visual hull projection onto any image i should entirely cover the corresponding silhouette S_i in that image. This is the constraint that viewing cones should satisfy with true camera parameters. It encodes all the geometric consistency constraints that apply with silhouettes and, as such, is optimal. However, this expression in its current form does not yield a practical cost function for camera parameters since all configurations leading to an empty visual hull are equally considered, thus making convergence over cost functions very uncertain in many situations. To overcome this difficulty, viewing cones can be considered pairwise, as explained in the following section.
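    Expression (2) also suggests a crude numerical consistency test: reproject an estimated visual hull and count the silhouette pixels it fails to cover. The hypothetical sketch below is a companion to the visual-hull sketch in section 2, with the same assumed inputs; dense voxel sampling is assumed so the rasterized projection has no holes:

```python
import numpy as np

def coverage_residual(hull_voxels, cameras, silhouettes):
    """Eq. (2): count the silhouette pixels not covered by the projected hull."""
    X = np.hstack([hull_voxels, np.ones((len(hull_voxels), 1))])
    uncovered = 0
    for P, S in zip(cameras, silhouettes):
        x = X @ P.T
        z = np.clip(x[:, 2], 1e-9, None)                 # guard against division by zero
        u = np.round(x[:, 0] / z).astype(int)
        v = np.round(x[:, 1] / z).astype(int)
        h, w = S.shape
        ok = (x[:, 2] > 0) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
        cover = np.zeros_like(S, dtype=bool)
        cover[v[ok], u[ok]] = True                       # rasterized P_i(VH(I_O))
        uncovered += int(np.count_nonzero(S & ~cover))   # S_i − P_i(VH(I_O))
    return uncovered                                     # 0 for consistent parameters
```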

    3.2 Pairwise Cone Tangency

    We can easily derive the pairwise tangency constraint from the general expression (2). Substituting the visual hull definition (1) in (2):

    (2)  ⇔  ⋃_{i∈I_O} [S_i − P_i(⋂_{j∈I_O} C_j)] = ∅.

    Since projection is a linear operation preserving incidence relations:

    (2)  ⇒  ⋃_{i∈I_O} [S_i − ⋂_{j∈I_O} P_i(C_j)] = ∅.

    Note that, in the above expression, the exact equivalence with (2) is lost since projecting viewing cones individually introduces depth ambiguities and, hence, does not ensure a common intersection of all cones as in (2). By distributive laws:

    (2)  ⇒  ⋃_{(i,j)∈I_O×I_O} [S_i − P_i(C_j)] = ∅.   (3)

    Expression (3) states that all viewing cones of a single scene should be pairwise tangent. By pairwise tangent, it is meant that all viewing rays from one cone intersect the other cone, and reciprocally. This can be seen as the extension of the epipolar constraint to silhouettes (see figure 2). Note that this constraint is always satisfied by concentric viewing cones, for which no frontier points exist. Note also that, even if (3) and (2) are not strictly equivalent, they are equivalent in most general situations.

¹ i.e., a projection such that there is a one-to-one mapping between rays from the projection center and image points.


Fig. 2. Pairwise tangency constraint: silhouette S_i is a subset of the viewing cone projection P_i(C_j) in image i.

    3.3 Connection with Frontier Points

A number of approaches consider frontier points and the constraints they yield on camera configurations. Frontier points are particular points which lie on both the object's surface and the visual hull, which project onto silhouettes in two or more images, and where the epipolar plane is tangent to the surface (see figure 1). They therefore satisfy what is called the generalized epipolar constraint [3]. They thereby allow projective reconstruction when localized in images [5, 6]. The connection between the generalized epipolar constraint and the pairwise tangency constraint (3) is that the latter implies the former at particular frontier points. Intuitively, if two viewing cones are tangent, then the generalized epipolar constraint is satisfied at extremal frontier points where viewing lines graze both viewing cones.

    4 Quantitative Criterion

The pairwise tangency is a condition that viewing cones must satisfy to ensure that the same objects are inside all cones. In this section, we introduce a distance function that evaluates this condition.

    4.1 Distances Between a Viewing Ray and a Viewing Cone

The distance function between a ray and a cone that we seek should preferably respect several conditions:

1. It should be expressed in a fixed metric with respect to the data, thus in the images, since a 3D metric will change with the camera parameters.
2. It should be a monotonic function of the respective locations of ray and cone.
3. It should be zero if the ray intersects the viewing cone. This intersection, while apparently easy to verify in the images, requires some care when epipolar geometry is used. Figure 3 depicts, for instance, a few situations where the epipolar line of a ray intersects the silhouette even though the ray does not intersect the viewing cone. These situations occur because no distinction is made between the front and back of rays.
4. It should be finite in general, so that the situations in figure 3 can be differentiated.


Fig. 3. A ray and the cross-section of the viewing cone in the corresponding epipolar plane: three of the situations where unoriented epipolar geometry will fail and detect intersections.

Fig. 4. The spherical image model: viewing rays project onto epipolar arcs on the sphere.

In light of this, a fairly simple but efficient approach is to consider a spherical image model instead of a planar one (see figure 4), associated with an angular metric. The distance from a ray to a viewing cone is then the shortest path on the sphere from the viewing cone to the ray's projection. This projection forms an epipolar circle-arc on the sphere, delimited by the epipole and the intersection of the ray direction with the sphere. The ray projection is then always the shortest arc between these two points, which can coincide if the ray goes through the viewing cone apex. Two different situations occur, depending on the respective positions of the ray's epipolar plane and the viewing cone:

1. The plane intersects the viewing cone apex only, as in figure 4. The point on the circle containing the epipolar arc and closest to the viewing cone must be determined. If such a point is on the epipolar arc, then the distance we seek is its distance to the viewing cone. Otherwise, it is the minimum of the distances between the arc boundary points and the viewing cone.
2. The plane goes through the viewing cone. The distance is zero in the case where the ray intersects the viewing cone section in the epipolar plane, and the shortest distance between the epipolar arc boundary points and the viewing cone section in the other case. This distance is easily computed using angles in the epipolar plane (a discretized sketch follows this list).

    4.2 Distance Between 2 Viewing Cones

A distance function between a ray and a viewing cone has been defined in the previous section; this section discusses how to integrate it over a cone. The distance between two viewing cones is then simply defined by a double integration over the two cones concerned.


Fig. 5. The distance between two viewing cones as a function of: (green) one focal length, which varies in the range [f - 0.4f, f + 0.4f], with f the true value; (blue) one translation parameter, to which is added from -0.4 to 0.4 of the camera-scene distance; (red) one Euler orientation angle, which varies in the range [θ - 0.4π, θ + 0.4π], with θ the true value. The filled points denote the limit distances on the curves above which the two cones do not intersect at all.

Recall that silhouettes and viewing cones are discrete in practice and thus defined by sets of contour points in the images and boundary rays in space. The simplest solution then consists in summing individual distances over boundary rays. Assume that r_i^k is the kth ray on the boundary of viewing cone C_i, and that d(r_i^k, C_j) = d_{ij}^k is the distance between r_i^k and C_j as defined in the previous section. Then the distance D_{ij} between C_i and C_j is:

\[ D_{ij} = \sum_k d_{ij}^k + \sum_l d_{ji}^l = d_{ij} + d_{ji}. \tag{4} \]

Remark that D_{ij} = D_{ji} but d_{ij} ≠ d_{ji}. The above expression is easy to compute once the distance function is established. It can be applied to all boundary viewing rays; however, mainly rays on the convex hulls of silhouettes are concerned by the pairwise tangency constraint, so we consider only them to improve computational efficiency. Figure 5 illustrates the distance D_{ij} between two viewing cones of a synthetic body model as a function of various parameters of one cone's camera. This graph demonstrates the smooth behavior of the distance around the true parameter values, even when the cones do not intersect at all.
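As a sketch, under the same assumptions as before, the symmetric distance (4) restricted to convex-hull rays might read:

def cone_to_cone_distance(hull_rays_i, hull_rays_j, dist_to_cone_j, dist_to_cone_i):
    # D_ij = d_ij + d_ji (eq. 4): sum the ray-to-cone distance of Sect. 4.1
    # over the boundary rays of each cone; rays on the silhouette convex
    # hulls suffice for the pairwise tangency constraint.
    d_ij = sum(dist_to_cone_j(r) for r in hull_rays_i)
    d_ji = sum(dist_to_cone_i(r) for r in hull_rays_j)
    return d_ij + d_ji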

    5 Silhouette Calibration Ratio

Following the quantitative criterion, we introduce a simple qualitative criterion which evaluates how silhouettes contribute to the visual hull for a given calibration.

Recall that any viewing ray, from any viewing cone, should be intersected by all other image viewing cones along an interval common to all cones. Let δ_r be an interval along ray r intersected by viewing cones, and let us call N(δ_r) the number of images contributing (images for which a viewing cone intersects r) inside that interval. Then


the sum over the rays r:

\[ \sum_r \max_{\delta_r} \big( N(\delta_r) \big), \]

should theoretically be equal to m(n-1) if m rays and n images are considered. Now this criterion can be refined by considering each image contribution individually along a viewing ray. Let δ_r^i be an interval, along ray r, where image i contributes. Then the silhouette calibration ratio C_r, defined as:

\[ C_r = \frac{1}{m(n-1)^2} \sum_r \sum_i \max_{\delta_r^i} \big( N(\delta_r^i) \big), \tag{5} \]

should theoretically be equal to 1, since each image should have at least one contribution interval with (n-1) image contributions. This qualitative criterion is very useful in practice because it reflects the combined quality of a set of silhouettes and of a set of camera parameters. Notice, however, that it can hardly be used for optimization because of its discrete, and thus non-smooth, nature.

    6 Experimental Results

The pairwise tangency presented in the previous sections constrains camera parameters when a set of static silhouettes I_O is known. For calibration, different sets I_O should be considered. They can easily be obtained, from moving objects for instance, as in [5]. The distances between viewing cones are then minimized over the camera parameter space through a least squares approach:

\[ \epsilon_{I_O} = \min_{\Theta} \sum_{(i,j) \in I_O \times I_O} D_{ij}^2, \tag{6} \]

where Θ is the set of camera parameters to be optimized. ε_{I_O} is equivalent to a maximum likelihood estimate of the camera parameters under the assumption that viewing rays are statistically independent. The above quantitative sum can be minimized by standard non-linear methods such as Levenberg-Marquardt.
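With SciPy, the minimization (6) can be sketched as follows, where cone_distance is a hypothetical wrapper that rebuilds the viewing cones from the current parameters and evaluates D_ij as in Sect. 4:

import numpy as np
from scipy.optimize import least_squares

def calibrate(theta0, pairs, cone_distance):
    # Minimize the sum of D_ij^2 over the camera parameters theta (eq. 6)
    # with a Levenberg-Marquardt solver; requires len(pairs) >= len(theta0).
    def residuals(theta):
        return np.array([cone_distance(theta, i, j) for (i, j) in pairs])
    return least_squares(residuals, theta0, method="lm").x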

    6.1 Synthetic Data

Synthetic sequences, composed of images of dimensions 300x300, were used to test the robustness of the approach. Seven cameras, with standard focal lengths, view a running human body. All camera extrinsic parameters and one focal length per camera, assuming known or unit aspect ratios, are optimized. Different initial solutions are tested by adding various percentages of uniform noise to the exact camera parameters. For the focal lengths and the translation parameters, the noise amplitudes vary from 0% up to 40% of the exact parameter value; for the pose angle parameters, the noise amplitudes vary from 0% up to 40% of 2π. Figure 6 shows, on the left, the silhouette calibration ratios after optimization and, on the right, the relative errors in the estimated camera parameters after optimization using 5 frames per camera. These results first validate the silhouette calibration ratio as a global estimator of the quality of any calibration with respect to silhouette data. Second, they show that using only one frame per camera is intractable in most situations. However, they also prove that, using several frames, calibration can be recovered with good precision even far from the exact solution.


Fig. 6. Robustness to the initial calibration, plotted against the calibration parameter noise (%): left, the silhouette calibration ratio for 1, 3 and 5 frames per camera; right, the relative errors in the estimated camera parameters for the 5-frame case: errors relative to the true value for the focal length, relative to the camera-scene distance for the translation parameters, and relative to 2π for the angle parameters.

Other experiments, not presented due to lack of space, show that adding a reasonable amount of noise to silhouette vertices, typically 1-pixel Gaussian noise, only slightly changes these results.

    6.2 Real Data

Our approach was also tested in a real environment, with 6 firewire cameras viewing a moving person. A calibration obtained by optimizing an initial solution using known points is available and is considered as the ground truth.

Fig. 7. Top: one of the original images, the corresponding silhouette, and the visual hull model obtained with the ground truth calibration. Bottom: three models corresponding to calibrations obtained with our method using, respectively, 1, 3 and 5 frames per camera.


In the following experiments, we use the same initial solution for the calibration with viewing cones. As in the synthetic case, all camera extrinsic parameters and one focal length per camera are optimized. Figure 7 shows, on top, the input images and a visual hull model obtained using the ground truth calibration; on the bottom, models obtained from the same silhouettes, but using our approach with, respectively, 1, 3 and 5 frames per camera. Apart from a scale difference, not shown and due to the fact that fixed dimensions were imposed on the ground truth solution, the two rightmost models are very close to the ground truth one.

    7 Conclusion

We have studied the problem of estimating camera parameters using silhouettes. It has been shown that, under mild assumptions, all the geometric constraints given by silhouettes are ensured by the pairwise tangency constraint. A second contribution of this paper is to provide a practical criterion based on the distance between two viewing cones. This criterion appears to be efficient in practice, since it can handle a large variety of camera configurations, in particular when viewing cones are distant. It therefore allows multi-camera environments to be easily calibrated when an initial solution exists. The criterion can also be minimized using efficient and fast non-linear approaches. The method is therefore also aimed at real-time estimation of camera motion with moving objects.

    References

1. Rieger, J.: Three-Dimensional Motion from Fixed Points of a Deforming Profile Curve. Optics Letters 11 (1986) 123-125

2. Cipolla, R., Åström, K., Giblin, P.: Motion from the Frontier of Curved Surfaces. In: Proceedings of the 5th International Conference on Computer Vision, Boston (USA) (1995) 269-275

3. Åström, K., Cipolla, R., Giblin, P.: Generalised Epipolar Constraints. In: Proceedings of the 4th European Conference on Computer Vision, Cambridge (England) (1996) 97-108. Lecture Notes in Computer Science, volume 1065

4. Joshi, T., Ahuja, N., Ponce, J.: Structure and Motion Estimation from Dynamic Silhouettes under Perspective Projection. In: Proceedings of the 5th International Conference on Computer Vision, Boston (USA) (1995) 290-295

5. Sinha, S., Pollefeys, M., McMillan, L.: Camera Network Calibration from Dynamic Silhouettes. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Washington (USA) (2004)

6. Furukawa, Y., Sethi, A., Ponce, J., Kriegman, D.J.: Structure from Motion for Smooth Textureless Objects. In: Proceedings of the 8th European Conference on Computer Vision, Prague (Czech Republic) (2004)

7. Fitzgibbon, A., Cross, G., Zisserman, A.: Automatic 3D Model Construction for Turn-Table Sequences. In: Proceedings of the SMILE Workshop on Structure from Multiple Images in Large Scale Environments. Volume 1506 of Lecture Notes in Computer Science (1998) 154-170

8. Mendonça, P., Wong, K.Y., Cipolla, R.: Epipolar Geometry from Profiles under Circular Motion. IEEE Transactions on PAMI 23 (2001) 604-616

9. Jiang, G., Quan, L., Tsui, H.: Circular Motion Geometry Using Minimal Data. IEEE Transactions on PAMI 26 (2004) 721-731

10. Laurentini, A.: The Visual Hull Concept for Silhouette-Based Image Understanding. IEEE Transactions on PAMI 16 (1994) 150-162

Towards a Guaranteed Solution to Plane-Based Self-calibration

Benoît Bocquillon, Pierre Gurdjos, and Alain Crouzil

Université Paul Sabatier, IRIT-TCI, 118 route de Narbonne, 31062 Toulouse, France
{bocquillon, gurdjos, crouzil}@irit.fr

Abstract. We investigate the problem of self-calibrating a camera from multiple views of a planar scene. By self-calibrating, we refer to the problem of simultaneously estimating the camera intrinsic parameters and the Euclidean structure of one 3D plane. A solution is usually obtained by solving a non-linear system via local optimization, with the critical issue of parameter initialization, especially for the focal length. Arguing that these five parameters are inter-dependent, we propose an alternative problem formulation with only three d.o.f., corresponding to three parameters to estimate. In light of this, we are concerned with global optimization, in order to get a guaranteed solution within the shortest response time. Interval analysis provides an efficient numerical framework, which proves to be highly effective with regard to both estimation accuracy and computation time.

    1 Introduction

The self-calibration of a camera consists in determining, either partially or completely, the metric properties of the camera and/or the scene from a set of uncalibrated views. The principle of self-calibration is to use absolute entities as targets, geometrically constrained by some prior information about the internal or external parameters of the camera. Absolute targets are abstract entities, located at infinity, encoding the Euclidean structure (ES) of the considered d-dimensional space, with the characteristic property of being left invariant under similarities¹ in d-space [1, 2]. In 3-space, the target is the absolute conic (AC), which is a circle of imaginary radius on the plane at infinity π_∞. The AC has the well-known property that its image (IAC) is globally invariant under camera motion, provided the camera internal parameters are constant. This is the starting point of numerous 3D self-calibration methods (see [1, chapter 19] for a review). On the basis of a projective reconstruction of the scene, 3D self-calibration determines the ES of the 3D space in terms of the AC and the plane at infinity, in projective coordinates. This can be achieved either separately or simultaneously. In the latter case, the AC is treated as a rank-3 envelope of 3D planes, known as the absolute dual quadric in the literature. Assuming that the focal length is the only unknown, closed-form and linear solutions can be obtained, e.g., as in [3].

The problem under discussion in this work is the 2D (or plane-based) self-calibration of a camera, i.e., by observing a 3D plane π, with regard to general camera motion.

    1 i.e., transformations preserving angles and changing distances in the same ratio.



In 2-space, the self-calibration targets are the circular points (CP), which are two conjugate complex points of π on the line at infinity, meeting all the circles of π and the AC of π_∞. Since Triggs' work [4], it is known that 2D self-calibration is possible, using the constraint that the images of the CP (ICP) lie on the IAC, only involving inter-view homographies induced by π. Because no other (general) invariance of the ICP can be exhibited, very few 2D self-calibration methods have been reported [5, 6, 4], except for some specific camera motions [7, 8]. Furthermore, contrary to 3D self-calibration, no closed-form or linear solution exists, even with a simplified camera model. Such a problem, consisting in determining simultaneously the CP and the AC, is non-linear in essence. As stated in [4], the problem parameterization requires 4 d.o.f. for the ICP plus 5 d.o.f. for the AC. A solution can be obtained via local optimization, from at least 5 views, with the critical issue of parameter initialization, especially for the focal length.

Our starting point is to reduce the number of parameters to estimate by using the fact that, since the CP lie on the AC, there is a redundancy in the problem parameterization. This inter-dependence of parameters in Triggs' statement is a modeling constraint that has no reason not to be exactly ensured. Actually, Triggs initially treated it as an equation, which does not really make sense, as we will argue later. That said, our contribution is to propose a new minimal parameterization of the 2D self-calibration problem, by introducing as target a degenerate conic envelope consisting of the point-pair at which the AC meets the line at infinity, i.e., consisting of the CP. Thanks to our Propositions (1) and (2), we show that we only require estimating the affine structure of the plane along with the internal parameters. This leads to a formulation with seven unknowns/d.o.f. instead of the nine initially mentioned in [4]. Assuming that the constant focal length is the sole unknown, only three parameters have to be estimated. This paves the way for finding a guaranteed solution to the problem, as this small number of unknowns is well suited to the use of interval analysis [9]. Interval analysis has been widely used in global optimization problems [10] and affords the guarantee that the global minimum has been found. It has been successfully applied to the 3D self-calibration problem [11]. It provides an efficient numerical framework, which proves to be highly effective with regard to both estimation accuracy and computation time.

This paper is structured as follows. First, starting with the basic 2D self-calibration equations of [4], we explain how to obtain a minimal parameterization of the problem, from which we derive a cost function. Second, we review the main rules of interval analysis and the global minimization scheme used here. Eventually, we give the results obtained with synthetic and real data, and conclusions are drawn.

    2 Minimal Parameterization of 2D Self-calibration

    2.1 Foreword and Notations

Our problem is that of recovering the Euclidean structure (ES) of some 3D plane π, called the world plane, seen in multiple views by some uncalibrated camera. All that is assumed to be known are the inter-view homographies induced by π.

Without any additional knowledge, this problem cannot be separated from that of calibrating the camera, i.e., of recovering its intrinsic parameters. Stated together, these


are then referred to as the plane-based self-calibration problem [4]. [5] describes an alternative to [4]. We will give in §2.2 the link between these two constraints.

We use some MATLAB-like notations: 1:n denotes the range 1, ..., n. M_(1:r,1:c) denotes the r x c submatrix of M selected by the row range 1:r and the column range 1:c. The notation M_(:,1:c), resp. M_(1:r,:), selects the first c (resp. r) columns, resp. rows, of M. We also define the canonical vectors:

\[ \mathbf{e}_1 \equiv (1, 0, 0)^{\mathsf T}, \quad \mathbf{e}_2 \equiv (0, 1, 0)^{\mathsf T}, \quad \mathbf{e}_3 \equiv (0, 0, 1)^{\mathsf T}. \tag{1} \]

The matrix [x]_× refers to the order-3 skew-symmetric matrix such that [x]_× y = x × y, for all y in R³. In this paper, we will make heavy use of the equality [Tx]_× = det(T) T^{-T} [x]_× T^{-1}. The notation i always refers to the imaginary number √-1.

In the following, we assume some basic results of projective geometry. These can be found in standard textbooks, e.g., in [1, 2]. We remind the reader of some essential notions and establish some novel properties relevant to our work.

The image of the absolute conic (IAC) has matrix ω ≡ K^{-T} K^{-1}, where K is the calibration matrix [1, §5.1] that encodes the internal camera parameters and is, in its most general form:

\[ K \equiv \begin{bmatrix} \alpha_u & \gamma & u_0 \\ 0 & \alpha_v & v_0 \\ 0 & 0 & 1 \end{bmatrix}, \tag{2} \]

where α_u, α_v represent the focal length in terms of pixel dimensions in the u and v directions respectively, (u_0, v_0) are the principal point pixel coordinates, and γ is the skew factor.

    2.2 Plane-Based Self-calibration Equations

Let P denote the unknown (Euclidean) world-to-image homography, mapping entities of π to their projections on the image plane, and let H_j be the known inter-view homography, induced by π, from the current view to some view number j.

The (Regular) Plane-Based Self-calibration Equations. Rigorously, the ES of π is given in terms of its imaged circular points (ICP) P(I_±), whereas the circular points (CP) I_± are, by definition [1, pp. 52-53], conjugate complex points at infinity in π, common to all of its circles. In any Euclidean representation, the CP have canonical coordinates e_± ≡ e_1 ± i e_2 = (1, ±i, 0)^T, which are invariant under any 2D similarity S of π, i.e., e_± ~ S e_±, where S in R^{3x3} is the matrix of S. In image representation, the coordinates of the ICP P(I_±), denoted by x_± ≡ x_1 ± i x_2, satisfy x_± ~ P S e_±, where P in R^{3x3} is the matrix of P. Note that x_± only have four d.o.f., basically the eight d.o.f. of P minus the four d.o.f. of S.

The ICP are, by projective invariance, on the vanishing line, common to all imaged circles, including the image of the absolute conic [1, pp. 81-83] of the plane at infinity. The IAC is the locus of all ICP (i.e., of all 3D planes), which entails that x_±^T ω x_± = 0, or equivalently (see [1, p. 211] for more details):

\[ \mathbf{x}_1^{\mathsf T} \omega \, \mathbf{x}_2 = 0 \quad \text{and} \quad \mathbf{x}_1^{\mathsf T} \omega \, \mathbf{x}_1 - \mathbf{x}_2^{\mathsf T} \omega \, \mathbf{x}_2 = 0. \tag{3} \]


In view number j, the constraint is described by x_±^T H_j^T ω_j H_j x_± = 0, or:

\[ \mathbf{x}_1^{\mathsf T} H_j^{\mathsf T} \omega_j H_j \mathbf{x}_2 = 0 \quad \text{and} \quad \mathbf{x}_1^{\mathsf T} H_j^{\mathsf T} \omega_j H_j \mathbf{x}_1 - \mathbf{x}_2^{\mathsf T} H_j^{\mathsf T} \omega_j H_j \mathbf{x}_2 = 0, \tag{4} \]

where ω_j is the matrix of the IAC in view number j and H_j is the matrix of H_j.

The Dual Plane-Based Self-calibration Equations. A (maybe) more intuitive parameterization of the ES can also be given in terms of any (Euclidean) world-to-image homography P ∘ S, where S denotes an arbitrary 2D similarity. Indeed, by applying (P ∘ S)^{-1} to the image plane, we get a Euclidean reconstruction of π, P ∘ S being referred to as a rectifying homography.

If we treat the ICP as a degenerate conic envelope, i.e., as the assemblage of isotropic lines as tangents, we get a conic referred to as the image of the conic dual to the circular points (ICDCP) in [1, p. 52], whose matrix is of the form:

\[ C \equiv \mathbf{x}_- \mathbf{x}_+^{\mathsf T} + \mathbf{x}_+ \mathbf{x}_-^{\mathsf T} \sim P(\mathbf{e}_- \mathbf{e}_+^{\mathsf T} + \mathbf{e}_+ \mathbf{e}_-^{\mathsf T}) P^{\mathsf T} \sim P S (\mathbf{e}_- \mathbf{e}_+^{\mathsf T} + \mathbf{e}_+ \mathbf{e}_-^{\mathsf T}) S^{\mathsf T} P^{\mathsf T}, \tag{5} \]

where S in R^{3x3} is the matrix of S. As e_- e_+^T + e_+ e_-^T ~ diag(1, 1, 0), a rectifying homography can be obtained by an adequate factorization [1, pp. 55-56] of C, e.g., based on the singular value decomposition (SVD), with singular values σ_1 ≥ σ_2 > 0 and σ_3 = 0:

\[ C = U \Sigma U^{\mathsf T} \sim X \operatorname{diag}(1,1,0) X^{\mathsf T} = \begin{bmatrix} \mathbf{x}_1 & \mathbf{x}_2 \end{bmatrix} \begin{bmatrix} \mathbf{x}_1 & \mathbf{x}_2 \end{bmatrix}^{\mathsf T}. \tag{6} \]

Therefore, the ICP can be specified in the form x_± ~ U_(:,1:2) Σ_(1:2,1:2)^{1/2} (1, ±i)^T ~ X S e_±.

Consequently, the constraints (4) can be put in the matrix form:

\[ \begin{bmatrix} \mathbf{x}_1 & \mathbf{x}_2 \end{bmatrix}^{\mathsf T} H_j^{\mathsf T} \omega_j H_j \begin{bmatrix} \mathbf{x}_1 & \mathbf{x}_2 \end{bmatrix} \sim I_{2 \times 2}. \tag{7} \]

We now highlight an interesting decomposition of the ICDCP matrix C. Basically, our aim is to put into equation the fact that the degenerate conic C consists of the two points at which the vanishing line v meets the IAC ω. Since the AC is a circle on the plane at infinity, these two points are the ICP.

Proposition 1. The ICDCP matrix satisfies the following decomposition:

\[ C \sim [\mathbf{v}]_\times \, \omega \, [\mathbf{v}]_\times^{\mathsf T}, \tag{8} \]

where ω is the IAC matrix and v is the vanishing line vector.

Proof. Define Δ ≡ [v]_× ω [v]_×^T. Clearly Δ is rank-2, so, as a conic envelope, Δ consists of two distinct points p, q, i.e., Δ ~ pq^T + qp^T. Let us show these are the ICP. On the one hand, we see that Δv = 0, which implies that both p and q are on the vanishing line v. On the other hand, any line w ≠ v verifying w^T Δ w = 0 passes either through p or q. Assume w contains p: this entails that v × w ~ p and so p^T ω p = 0. As a result, since ω is the locus of all ICP, p is one ICP of π and q its conjugate.
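A quick numeric sanity check of the decomposition (8), with made-up values for K and v, confirms (up to floating-point round-off) that the point pair of C lies both on v and on the IAC:

import numpy as np

def skew(x):
    # [x]_x such that skew(x) @ y == np.cross(x, y)
    return np.array([[0.0, -x[2], x[1]],
                     [x[2], 0.0, -x[0]],
                     [-x[1], x[0], 0.0]])

K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
Kinv = np.linalg.inv(K)
omega = Kinv.T @ Kinv                      # IAC
v = np.array([0.3, -0.5, 1.0])             # some vanishing line
C = skew(v) @ omega @ skew(v).T            # eq. (8): PSD matrix of rank 2
lam, U = np.linalg.eigh(C)                 # ascending eigenvalues, lam[0] ~ 0
x1, x2 = np.sqrt(lam[2]) * U[:, 2], np.sqrt(lam[1]) * U[:, 1]
p = x1 + 1j * x2                           # one imaged circular point
print(abs(v @ p), abs(p @ omega @ p))      # both ~ 0: p lies on v and on the IAC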


Minimal Parameterization. As explained above, the ICP can be specified from C in the form x_1 ± i x_2, with [x_1 x_2] ≡ U_(:,1:2) Σ_(1:2,1:2)^{1/2} obtained from (6). In this work, we will need a formal expression of x_1 and x_2.

Proposition 2. Vectors x_1, x_2 satisfying (6), and so (7), can be written in the form:

\[ \begin{bmatrix} \mathbf{x}_1 & \mathbf{x}_2 \end{bmatrix} \sim \begin{bmatrix} [\boldsymbol{\nu}]_\times \, \omega \, \mathbf{e}_k & [\boldsymbol{\nu}]_\times \, \omega \, [\boldsymbol{\nu}]_\times \, \omega \, \mathbf{e}_k \end{bmatrix}, \quad k \in \{1, 2, 3\}, \tag{9} \]

where ν ≡ v / ||K^T v|| and e_k is a canonical vector, as defined in (1).

The proof requires reminding the reader that the vanishing line can be written as v ~ K^{-T} n, where n is the unit normal to π in the camera frame. Let us also define the calibrated ICDCP C̄ ≡ λ K^{-1} C K^{-T}, where λ is a scalar such that C̄ = [n]_× [n]_×^T.

Proof. The singular values of C̄ are {1, 1, 0}, so ran(C̄) = ran([n]_×) and null(C̄) = null([n]_×). Thanks to the SVD theorem [12], we know that the matrix W in R^{3x3}, W^T W ≡ I_3, such that C̄ ≡ W diag(1, 1, 0) W^T, has the properties that ran(C̄) = span{w_1, w_2} and null(C̄) = span{w_3}. As a result, we can compute:

\[ \mathbf{w}_1 = [\mathbf{n}]_\times \mathbf{e}_k, \quad \mathbf{w}_2 = [\mathbf{n}]_\times^2 \mathbf{e}_k, \quad \mathbf{w}_3 = \mathbf{n}, \tag{10} \]

where w_2 = w_3 × w_1 = [w_3]_× w_1. Substituting K^T v for n in (10), after some normalizations, we obtain (9).

The proposed form (9) offers the obvious advantage of a minimal parameterization of the self-calibration problem. Substituting (9) into (4), there are now seven d.o.f. instead of the nine in [4].

Link with Malis' Constraint [5]. Introducing H̄_j ≡ K_j^{-1} H_j K_j, the calibrated ICDCP in view number j is C̄_j ~ [n_j]_× [n_j]_×^T ~ H̄_j [n]_× [n]_×^T H̄_j^T, where n_j is the unit normal to π in camera frame number j. Interestingly enough, since the singular values of C̄_j are {1, 1, 0}, those of H̄_j [n]_× are also {1, 1, 0}, up to a scale factor. This latter property is the theoretical foundation of the self-calibration constraints of [5].

    2.3 Formulation of the Problem

Assume that the IAC is constant in the views, i.e., ω_j ≡ ω. Given N views, i.e., (N-1) inter-view homographies H_j, 2 ≤ j ≤ N, the self-calibration problem of a camera is that of solving the system consisting of the two equations (3) and the 2(N-1) equations (4) for the p d.o.f. in the IAC matrix plus q in the ICP vectors. This is a non-linear, possibly constrained, problem which has, until now, been solved using iterative methods. It requires initial values, which is a critical issue, already mentioned in [4].

Because of the proposed form (9) of the ICP, compared to [4], our problem modeling only exhibits seven unknowns instead of nine. However, there is no magic: with the proposed form, equation (3), related to the key view, is implicitly satisfied, while, in [4], it is considered as an equation to be satisfied. We ask the question: do we have to consider (3) as a constraint or as an equation?


Since no input data, i.e., no estimated homography, is involved in (3), there is no logical reason for this equation not to be exactly satisfied. Actually, the nine parameters of [4] are not independent and must satisfy the additional constraint (3). More generally, with regard to the estimation of the homography, from the key view to some view number j, using feature correspondences, there is no logical reason for assigning any error to the positions of the (arbitrary) features in the key view.

As one might expect, there are no more than two constraints for the plane-based self-calibration problem, but several ways of expressing them.

Simplified Camera Model. We now investigate the minimal parameterization of the ICP under the assumption of a simplified camera model. Let the calibration matrix be K = diag(α, α, 1), where α represents the focal length in pixels. Let v ~ (cos θ, sin θ, -ρ)^T, where ρ is the orthogonal distance, in pixels, from the principal point to the vanishing line. This means that (9) can also be written, in the form of (7), with:

\[ \begin{bmatrix} \mathbf{x}_1 & \mathbf{x}_2 \end{bmatrix} = \begin{bmatrix} \sqrt{\rho^2 + \alpha^2}\, \sin\theta & \rho \cos\theta \\ -\sqrt{\rho^2 + \alpha^2}\, \cos\theta & \rho \sin\theta \\ 0 & 1 \end{bmatrix}. \tag{11} \]
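Under the simplified model, the unknown thus reduces to (α, θ, ρ); a sketch of (11) and of the per-view residuals of (4), with our own function names:

import numpy as np

def icp_from_params(alpha, theta, rho):
    # [x1 x2] of eq. (11) for K = diag(alpha, alpha, 1)
    # and v = (cos theta, sin theta, -rho)
    r = np.sqrt(rho**2 + alpha**2)
    return np.array([[ r * np.sin(theta), rho * np.cos(theta)],
                     [-r * np.cos(theta), rho * np.sin(theta)],
                     [ 0.0,               1.0               ]])

def view_residuals(alpha, theta, rho, H):
    # residuals of the constraints (4) in a view with inter-view homography H:
    # the 2x2 matrix of (7) must be proportional to the identity
    X = icp_from_params(alpha, theta, rho)
    Kinv = np.linalg.inv(np.diag([alpha, alpha, 1.0]))
    omega = Kinv.T @ Kinv
    M = X.T @ H.T @ omega @ H @ X
    return np.array([M[0, 1], M[0, 0] - M[1, 1]])   # both zero at the solution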

    3 Global Optimization Using Interval Analysis

    3.1 Interval Analysis

Interval analysis (IA) was born about forty years ago [13]. Several good introductions to IA are available in [10, 9].

An interval is denoted by x = [x̲, x̄], where x̲ and x̄ are the lower and the upper bound of x respectively. Interval vectors are called boxes. If x and y are two intervals, then the four elementary operations are defined by x op y = {x op y | x ∈ x and y ∈ y} for op ∈ {+, -, ×, ÷}. By composing these operations, we can compute an extension of the range of a function over an interval. For instance, if f(x) = x(x - 1), then an extension of f over [-1, 1] is f([-1, 1]) = [-1, 1]([-1, 1] - 1) = [-1, 1] × [-2, 0] = [-2, 2], which necessarily contains the exact range [-1/4, 2] of f.
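The toy example above can be reproduced in a few lines (real IA implementations additionally round interval bounds outward, so that the enclosure stays rigorous under floating point):

from dataclasses import dataclass

@dataclass
class Interval:
    lo: float
    hi: float
    def __add__(self, o):  return Interval(self.lo + o.lo, self.hi + o.hi)
    def __sub__(self, o):  return Interval(self.lo - o.hi, self.hi - o.lo)
    def __mul__(self, o):
        p = [self.lo * o.lo, self.lo * o.hi, self.hi * o.lo, self.hi * o.hi]
        return Interval(min(p), max(p))

x = Interval(-1.0, 1.0)
print(x * (x - Interval(1.0, 1.0)))   # Interval(lo=-2.0, hi=2.0), encloses [-1/4, 2]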

    3.2 IA-Based Global Optimization

The idea of using IA for global optimization has been investigated by many authors, [10, 14] to cite a few. In recent years, IA-based global optimization has exhibited many successes in various domains. It has also been successfully applied to 3D self-calibration [11]. The problem is the following: find the global minimum f* of a smooth function f, f* = min{f(x) | x ∈ D}, as well as the set of points at which it is attained, X* = {x ∈ D | f(x) = f*}, where D is a box. IA-based global optimization usually combines IA with a branch-and-bound algorithm. Let X be the box representing the search region and L a list of boxes to be processed. The basic scheme of the method can be stated as follows:


    1. Initialize L by placing the initial search region X0 in it.2. While L = do:

    a Remove a box X from L .b Process X (rejecting, reducing, critical point existence, ...).c Subdivise X and i