
Temporal Depth Video Enhancement Based On

Intrinsic Static Structure

Lu Sheng, King Ngi Ngan & Songnan Li

The Chinese University of Hong Kong

Outline

Introduction & Related Work


Temporal Depth Video Enhancement

Experimental Results

Conclusion & Future Work

ICIP, Oct. 27-30, 2014, Paris, France

Introduction

In a raw depth video capturing a natural scene

Various complex and even unpredictable dynamic contents

As well as spatial and temporal artifacts

Artifacts in the depth video

In both the spatial and temporal domains

Errors are caused by

Low resolution

Noise & outliers

Occlusions, material absorption & reflection

Distortion around object boundaries, etc.



Raw Kinect video

Color-coded Raw TOF video

Introduction

After spatial enhancement

Reduces artifacts in the spatial domain

But is not enough to deal with temporal flickering

No temporal consistency

Aggravates flickering artifacts



Raw Kinect video

Spatial JBF [1]

[1] J. Kopf et al., Joint bilateral upsampling, ACM ToG, 2007.

Introduction


After conventional spatiotemporal enhancement

Still contains temporal flickering

Distorts the necessary depth variation along dynamic objects

How can we eliminate temporal flickering without distorting the necessary depth variation along dynamic objects?



Coherent spatiotemporal JBF [2]

Spatial JBF

[2] C. Richardt et al., Coherent spatiotemporal filtering, upsampling and rendering of RGBZ videos, Computer Graphics Forum, 2012.

Related Work

Spatial Enhancement

Global optimization

J. Diebel et al., NIPS 2006

J. Yang et al., ECCV 2012

J. Park et al., ICCV 2011

etc.

High dimensional Filtering

J. Kopf et al., TOG 2007

J. Dolson et al., CVPR 2010

B. Huhle et al., CVIU 2010

etc.

PatchMatch, etc.

J. Lu et al., CVPR 2013

L. Sheng et al., ICIP 2013, etc.



Temporal Enhancement

Median filter

S. Matyunin et al., WSCG 2011

norm

S. H. Chan et al., TIP 2011

Texture consistency

Dongbo Min et al., TIP 2012

Deliang Fu et al., PCS 2010

Both texture and depth

N. A. Dodgson et al., CGF 2012

Model the scene

J. Shen et al., TIP 2013

J. Shen et al., CVPR 2013




A moving object

A static object

The static background

Kinect or another depth camera




A moving object

A static object



Captured depth map


Definition

A manifold that lies on or behind the input depth map

Contains the static structure of the captured scene



A moving object

A static object



Intrinsic static structure


Observation

Moving objects stay in front of it

Static regions and visible background areas are fused into it



A moving object

A static object



Intrinsic static structure


How can the intrinsic static structure be estimated in an online fashion?

How can it be applied to temporal enhancement of the depth video?



A probabilistic generative model

An online variational probabilistic update scheme with

Robust exclusion of dynamic foreground objects

Inclusion of the once-occluded background

Robust static region detection by the intrinsic static structure

Enhance the static region of the input depth frame

Dynamic objects have limited temporal consistency, so they are left out

Fuse the static region with the estimated intrinsic static structure




Camera center

Line of sight

Current intrinsic static structure

Behind the structure

Before the structure

The indicator function equals 1 when its argument is true and 0 otherwise

A Probabilistic Generative Model

An incoming depth sample belongs to one of three states

State-I: the intrinsic static structure

State-II: outliers in front or moving objects

State-III: outliers behind or revealed background




Camera center

Line of sight


State-I










Camera center

Line of sight


State-II










Camera center

Line of sight


State-III









The likelihood of an incoming depth sample w.r.t. the given intrinsic static structure

Gaussian prior over the depth of the intrinsic static structure

Dirichlet prior over the frequency of each state
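
The equations on this slide did not survive extraction, so the following is only a minimal sketch of a three-state mixture likelihood of this kind. It assumes a Gaussian noise term for State-I and uniform densities for State-II (in front of the structure) and State-III (behind it); the function names, the noise level `xi`, and the sensor range `d_min`/`d_max` are hypothetical and not taken from the talk.

```python
import math

def state_likelihoods(d, z, xi, d_min, d_max):
    """Per-state likelihoods of a depth sample d given a structure depth z.

    Assumed forms (not from the slides): Gaussian for State-I,
    uniform in front of z for State-II, uniform behind z for State-III.
    """
    # State-I: the sample lies on the static structure, with noise std xi.
    p_static = math.exp(-0.5 * ((d - z) / xi) ** 2) / (xi * math.sqrt(2 * math.pi))
    # State-II: the sample is in front of the structure (closer to the camera).
    p_front = 1.0 / (z - d_min) if d_min <= d < z else 0.0
    # State-III: the sample is behind the structure.
    p_behind = 1.0 / (d_max - z) if z < d <= d_max else 0.0
    return p_static, p_front, p_behind

def mixture_likelihood(d, z, weights, xi=10.0, d_min=500.0, d_max=5000.0):
    """Likelihood of d as a mixture weighted by the state frequencies."""
    components = state_likelihoods(d, z, xi, d_min, d_max)
    return sum(w * p for w, p in zip(weights, components))
```

Here `weights` plays the role of the state frequencies that the Dirichlet prior is placed over.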



Camera center




The posterior



Camera center

The set of previous depth samples

The set of current samples

On-line Update Scheme


The posterior



Camera center

Updated intrinsic static structure


If the input frame contains only the static scene and outliers, the updated intrinsic static structure is governed by the posterior, and we have

Its probable depth is

The reliability of the model is

Variational approximation for efficiency




Variational approximation for a 1D depth signal

a) The probable depth values of the intrinsic static structure versus the raw depth signals

b) The confidence interval of the intrinsic static structure

c) The proportions of the three states

d) The approximated data distribution versus the histogram of the raw data

The nature of the input data samples is still well captured by the variational approximation




If the input frame contains dynamic content, the posterior of each state determines how the intrinsic static structure is updated

State-I: fuse the new samples into the current model

State-II: leave the current model unchanged

State-III: there is a risk that the current model is incorrect, which might require re-initialization
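
The three update rules can be sketched per pixel as follows. This is a simplified hard-assignment update with soft Dirichlet counts, standing in for the variational update of the talk rather than reproducing it. The Gaussian over the structure depth (`mu`, `var`), the counts `alpha`, the noise level `xi`, and the outlier densities `u_front`/`u_behind` are all assumed names and values.

```python
import math

def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)

def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def state_posteriors(d, mu, var, alpha, xi=10.0, u_front=1e-3, u_behind=1e-3):
    """Posterior probability of State-I, State-II and State-III for sample d."""
    a0 = sum(alpha)
    # Probability that the structure depth is smaller than the sample depth,
    # i.e. that the sample lies behind the structure.
    p_behind = normal_cdf((d - mu) / math.sqrt(var))
    scores = [
        alpha[0] / a0 * normal_pdf(d, mu, var + xi ** 2),  # State-I
        alpha[1] / a0 * u_front * (1.0 - p_behind),        # State-II
        alpha[2] / a0 * u_behind * p_behind,               # State-III
    ]
    total = sum(scores)
    return [s / total for s in scores]

def update_pixel(d, mu, var, alpha, xi=10.0):
    """One online step: returns the new (mu, var, alpha) and the posteriors."""
    r = state_posteriors(d, mu, var, alpha, xi)
    # Soft counts: each state's frequency grows by its posterior probability.
    alpha = [a + ri for a, ri in zip(alpha, r)]
    if r[0] >= max(r[1], r[2]):
        # State-I: fuse the sample into the model (Gaussian product).
        gain = var / (var + xi ** 2)
        mu = mu + gain * (d - mu)
        var = (1.0 - gain) * var
    # State-II and State-III: the depth model is left unchanged here;
    # re-initialization for State-III is a separate decision.
    return mu, var, alpha, r
```

For example, with `mu=2000`, `var=100` and `xi=10`, a sample at 2002 is absorbed as State-I, while a sample at 1000 is classified as State-II and leaves the depth model untouched.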

Camera center

State-III

State-I

State-II


Results after variational approximation are valid for State-I and State-II

Only need an additional criterion for State-III


Re-initialization to avoid incorrect estimation when

Ratio

The proportion of State-III samples is too large to be confident that the current estimate is correct

Increment

The input depth samples have been in State-III for a run of successive frames

Or texture consistency is violated at the State-III locations

This applies when RGB-D video is available. Going one step further, a conditional random field can be used to robustly find the region to re-initialize
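
The two depth-only criteria (ratio and increment) might be checked per pixel as in the sketch below. The thresholds, the function names, and the choice to let either criterion trigger on its own are assumptions; the slide formulas that define the exact tests were lost in extraction, and the texture-consistency test is not covered.

```python
def update_behind_run(behind_run, state):
    """Count consecutive frames whose sample was classified as State-III.

    state is 0, 1 or 2 for State-I, State-II, State-III (assumed encoding).
    """
    return behind_run + 1 if state == 2 else 0

def should_reinitialize(alpha, behind_run, ratio_thresh=0.5, run_thresh=5):
    """Decide whether the static-structure model at a pixel should be reset.

    alpha: Dirichlet counts for (State-I, State-II, State-III).
    behind_run: number of consecutive State-III frames so far.
    """
    # Ratio: the expected State-III frequency under the Dirichlet counts.
    ratio = alpha[2] / sum(alpha)
    # Increment: State-III has persisted over a run of successive frames.
    return ratio > ratio_thresh or behind_run >= run_thresh
```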



Parameter initialization

When the input depth sample is valid

No texture information is available

When the input depth sample is invalid, just copy the previous model

Temporal Depth Video Enhancement

Blend the input depth frame (or the spatially enhanced depth frame) with the estimated intrinsic static structure



Camera center


Temporally enhanced depth frame

The intrinsic static structure

The posterior of state-I

The temporally enhanced depth value

For RGB-D video, both depth and texture information are helpful to fill depth holes

Infer the probability that the hole region belongs to the intrinsic static structure
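
One plausible reading of the blending step is a per-pixel convex combination weighted by the State-I posterior, sketched below. The blending formula itself is missing from the extracted slides, so the linear form and all names here are assumptions; hole filling is not covered.

```python
import numpy as np

def blend_with_static_structure(depth, static_depth, p_static):
    """Blend a depth frame with the static structure, pixel by pixel.

    depth:        input (or spatially enhanced) depth frame
    static_depth: probable depth of the intrinsic static structure
    p_static:     posterior of State-I per pixel, values in [0, 1]
    """
    depth = np.asarray(depth, dtype=float)
    static_depth = np.asarray(static_depth, dtype=float)
    p_static = np.clip(np.asarray(p_static, dtype=float), 0.0, 1.0)
    # Pixels that are confidently static take the temporally stable depth;
    # pixels on moving objects keep the input depth.
    return p_static * static_depth + (1.0 - p_static) * depth
```

With this form, a pixel with posterior 1 takes the structure depth and a pixel with posterior 0 keeps the input depth, so moving objects are not smeared over time.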


Intrinsic static structure estimation – 1D case




Intrinsic static structure estimation – 2D case




Temporal enhancement on synthetic static scenes



Input disparity map is contaminated by

Per-frame running time comparison (MATLAB platform)


Temporal enhancement on static scenes



The reliability map indicates that most flat or smooth surfaces of the captured scene are static with high reliability

Around edges or highly slanted surfaces, the reliability map has lower values

The reliability quantity


Temporal enhancement on static scenes




Temporal enhancement on dynamic scenes



Moving objects are found effectively and accurately from the state classification, with the help of a conditional random field and the registered color frame

Enhance the foreground region separately with a joint bilateral filter, and fill the holes in the static region with the estimated intrinsic static structure

Suppress noise and flickering artifacts at static regions
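
The composition of the output frame from these pieces could look like the sketch below. It assumes that holes are encoded as non-positive depth, that `moving_mask` comes from the state classification, and that `filtered_foreground` is the output of some spatial filter such as a joint bilateral filter; none of these conventions is confirmed by the slides.

```python
import numpy as np

def compose_frame(depth, static_depth, moving_mask, filtered_foreground):
    """Combine the static structure and the filtered foreground into one frame."""
    depth = np.asarray(depth, dtype=float)
    static_depth = np.asarray(static_depth, dtype=float)
    filtered_foreground = np.asarray(filtered_foreground, dtype=float)
    moving_mask = np.asarray(moving_mask, dtype=bool)

    out = depth.copy()
    # Holes in the static region are filled from the static structure.
    holes = (depth <= 0) & ~moving_mask
    out[holes] = static_depth[holes]
    # Moving objects take the spatially filtered depth only.
    out[moving_mask] = filtered_foreground[moving_mask]
    return out
```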


Temporal enhancement on dynamic scenes




Less motion delay will occur

Figure panels: (a), (b), (c) Coherent spatiotemporal JBF, (d) Spatiotemporal WMF, (e) Ours

Spatiotemporal depth video enhancement

Conclusion & Future Work

Temporal flickering problem of conventional depth enhancement approaches

Introduced the intrinsic static structure and its application to temporal enhancement of a depth video

Discussed its potential to handle static and dynamic scenes

Find a proper way to combine spatial and temporal enhancement, based on the intrinsic static structure, into a single framework

Research a more general formulation of the static structure

Handle depth video captured by moving cameras, etc.



Any Questions?



Variational Parameter Estimation

Factorize the posterior into independent Gaussian and Dirichlet distributions

The reliability of the model

The probable depth is

The posterior can be approximated by



Recursive estimation is possible!



Moment matching to estimate the hyperparameters



Closed-form solutions!
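
The slide equations for this step were lost in extraction. As a hedged reconstruction of the general shape only, the factorized approximation and the moment-matching conditions can be written as below, where $Z$ (structure depth), $\boldsymbol{\omega}$ (state frequencies), $\mu$, $\sigma^2$ and $\boldsymbol{\alpha}$ are assumed notation, $p$ denotes the exact posterior after a new sample, and the specific set of matched moments is an assumption.

```latex
% Factorized approximation: independent Gaussian and Dirichlet factors
q(Z, \boldsymbol{\omega})
  = \mathcal{N}(Z \mid \mu, \sigma^2)\,
    \mathrm{Dir}(\boldsymbol{\omega} \mid \boldsymbol{\alpha})

% Moment matching for the Gaussian factor
\mathbb{E}_q[Z] = \mathbb{E}_p[Z], \qquad
\mathbb{E}_q[Z^2] = \mathbb{E}_p[Z^2]

% Moment matching for the Dirichlet factor, using
% E[\omega_k] = \alpha_k / \alpha_0 and
% E[\omega_k^2] = \alpha_k(\alpha_k + 1) / (\alpha_0(\alpha_0 + 1)),
% with \alpha_0 = \sum_k \alpha_k
\mathbb{E}_q[\omega_k] = \mathbb{E}_p[\omega_k], \qquad
\sum_k \mathbb{E}_q[\omega_k^2] = \sum_k \mathbb{E}_p[\omega_k^2]
```

Because the approximation after each sample has the same form as the prior, it can serve as the prior for the next frame, which is what makes recursive estimation possible.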
