motion analysis for image enhancemenirani/papers/sequence_enhnc.pdf · motion analysis for image...
TRANSCRIPT
Motion Analysis for Image Enhancement�
Resolution� Occlusion� and Transparency�
Michal Iraniy Shmuel Peleg
Institute of Computer Science
The Hebrew University of Jerusalem
����� Jerusalem� ISRAEL
Abstract
Accurate computation of image motion enables the enhancement of image sequences� In sceneshaving multiple moving objects the motion computation is performed together with object segmen�tation by using a unique temporal integration approach�
After computing the motion for the di�erent image regions� these regions can be enhancedby fusing several successive frames covering the same region� Enhancements treated here includeimprovement of image resolution� �lling�in occluded regions� and reconstruction of transparentobjects�
� Introduction
We describe methods for enhancing image sequences using the motion information computed bya multiple motions analysis method� The multiple moving objects are �rst detected and tracked�using both a large spatial region and a large temporal region� and without assuming any temporalmotion constancy� The motion models used to approximate the motions of the objects are ��Dparametric motions in the image plane� such as a�ne and projective transformations� The motionanalysis is presented in a previous paper ��� ��� and will only be brie�y described here�
Once an object has been tracked and segmented� it can be enhanced using information fromseveral frames� Tracked objects can be enhanced by �lling�in occluded regions� and by improvingthe spatial resolution of their images� When the scene contains transparent moving objects� theycan be reconstructed separately�
�This research was supported by the Israel Academy of Sciences� Email address of authors� fmichalb�
pelegg�cs�huji�ac�il�yM� Irani was partially supported by a fellowship from the Leibniz Center�
Section � includes a brief description of a method used for segmenting the image plane intodi�erently moving objects� computing their motions� and tracking them throughout the imagesequence� Sections � �� and � describe the algorithms for image enhancement using the computedmotion information� Section presents a method for improving the spatial resolution of trackedobjects� Section � describes a method for reconstructing occluded segments of tracked objects� andSection � presents a method for reconstructing transparent moving patterns�
An initial version of this paper appeared in �����
�
� Detecting and Tracking Multiple Moving Objects
In this section we describe brie�y a method for detecting and tracking multiple moving objects inimage sequences� which is presented in detail in ���� Any other good motion computation methodcan be used as well� In this approach for detecting di�erently moving objects� a single motion is�rst computed� and a single object which corresponds to this motion is identi�ed and tracked� Wecall this motion the dominant motion� and the corresponding object the dominant object� Once adominant object has been detected and tracked� it is excluded from the region of analysis� and theprocess is repeated on the remaining image regions to �nd other objects and their motions�
When the image motion can be described by a ��D parametric motion model� and this modelis used for motion analysis� the results are very accurate at a fraction of a pixel� This accuracyresults from two features�
�� The use of large regions when trying to compute the ��D motion parameters�
�� Segmentation of the image into regions� each containing only a single ��D motion�
��� ��D Motion Models
��D parametric transformations are used to approximate the projected �D motions of objects onthe image plane� This assumption is valid when the di�erences in depth caused by the motions aresmall relative to the distances of the objects from the camera�
Given two grey level images of an object� I�x� y� t� and I�x� y� t� ��� it is assumed that�
I�x� p�x� y� t�� y� q�x� y� t�� t� �� � I�x� y� t�� ���
where �p�x� y� t�� q�x� y� t�� is the displacement induced on pixel �x� y� by the motion of the planarobject between frames t and t � �� It can be shown ���� that the desired motion �p� q� minimizesthe following error function at Frame t in the region of analysis R�
Err�t��p� q� �X
�x�y��R
�pIx � qIy � It��� ���
We perform the error minimization over the parameters of one of the following motion models�
�� Translation� � parameters� p�x� y� t� � a� q�x� y� t� � d� In order to minimize Err�t��p� q��its derivatives with respect to a and d are set to zero� This yields two linear equations in thetwo unknowns� a and d� Those are the two well�known optical �ow equations ��� ���� whereevery small window is assumed to have a single translation� In this translation model� theentire object is assumed to have a single translation�
�� A�ne� parameters� p�x� y� t� � a� bx� cy� q�x� y� t� � d� ex� fy� Deriving Err�t��p� q�with respect to the motion parameters and setting to zero yields six linear equations in thesix unknowns� a� b� c� d� e� f ��� ���
� Moving planar surface �a pseudo projective transformation�� � parameters ��� ���p�x� y� t� � a � bx � cy � gx� � hxy� q�x� y� t� � d � ex � fy � gxy � hy�� DerivingErr�t��p� q� with respect to the motion parameters and setting to zero� yields eight linearequations in the eight unknowns� a� b� c� d� e� f � g� h�
��� Detecting the First Object
When the region of support of a single object in the image is known� its motion parameters can becomputed using a multiresolution iterative framework � � �� �� � �� ���
Motion estimation is more di�cult in the common case when the scene includes several movingobjects� and the region of support of each object in the image is not known� It was shown in�� �� �� that in this case the motion parameters of a single object can be recovered accuratelyby applying the same motion computation framework �with some iterative extensions ��� ��� tothe entire region of analysis�
This procedure computes a single motion �the dominant motion� between two images� A seg�mentation procedure is then used �see Section ���� in order to detect the corresponding object �thedominant object� in the image� An example of a detected dominant object using an a�ne motionmodel between two frames is shown in Figure ��c� In this example� noise has a�ected strongly thesegmentation and motion computation� The problem of noise is overcome once the algorithm isextended to handle longer sequences using temporal integration �Section �� ��
��� Tracking Detected Objects Using Temporal Integration
The algorithm for the detection of multiple moving objects discussed in Section ��� can be extendedto track detected objects throughout long image sequences� This is done by temporal integration�where for each tracked object a dynamic internal representation image is constructed� This imageis constructed by taking a weighted average of recent frames� registered with respect to the trackedmotion� This image contains� after a few frames� a sharp image of the tracked object� and ablurred image of all the other objects� Each new frame in the sequence is compared to the internalrepresentation image of the tracked object rather than to the previous frame ��� ��� Following isa summary of the algorithm for detecting and tracking an object in an image sequence�
For each frame in the sequence �starting at t � �� do�
�� Compute the dominant motion parameters between the internal representation image ofthe tracked object Av�t� and the new frame I�t� ��� in the region M�t� of the trackedobject �see Section ����� Initially� M��� is the entire region of analysis�
�� Warp the current internal representation image Av�t� and current segmentation maskM�t� towards the new frame I�t� �� according to the computed motion parameters�
� Identify the stationary regions in the registered images �see Section ����� using the reg�istered mask M�t� as an initial guess� This will be the segmented region M�t��� of thetracked object in frame I�t� ���
�
a� b� c�
Figure �� An example of the evolution of an internal representation image of atracked object�a� Initially� the internal representation image is the �rst frame in the sequence�The scene contains four moving objects� The tracked object is the ball�b� The internal representation image after frames�c� The internal representation image after � frames� The tracked object �theball� remains sharp� while all other regions blur out�
�� Compute the updated internal representation image Av�t��� by warping Av�t� towardsI�t� �� using the computed dominant motion� and averaging it with I�t� ���
When the motion model approximates of the temporal changes of the tracked object well enough�shape changes relatively slowly over time in registered images� Therefore� temporal integration ofregistered frames produces a sharp and clean image of the tracked object� while blurring regionshaving other motions� Figure � shows an example of the evolution of an internal representationimage of a tracked rolling ball� Comparing each new frame to the internal representation imagerather than to the previous frame gives the algorithm a strong bias to keep tracking the sameobject� Since additive noise is reduced in the the average image of the tracked object� and sinceimage gradients outside the tracked object decrease substantially� both segmentation and motioncomputation improve signi�cantly�
In the example shown in Figure �� temporal integration is used to detect and track the domi�nant object� Comparing the segmentation shown in Figure ��c to the segmentation in Figure ��demphasizes the improvement in segmentation using temporal integration�
Another example of detecting and tracking objects using temporal integration is shown inFigure � In this sequence� taken by an infrared camera� the background moves due to cameramotion� while the car has another motion� It is evident that the tracked object is the background�as all other regions in the image are blurred by their motion�
��� Tracking Other Objects
After detecting and tracking the �rst object� attention is directed at other objects� This is done byapplying the tracking algorithm once more� this time to the rest of the image� after excluding the
�
a� b� c� d�
Figure �� Detecting and tracking the dominant object using temporal integra�tion�a�b� Two frames in the sequence� Both the background and the helicopter aremoving�c� The segmented dominant object �the background� using the dominant a�nemotion computed between the �rst two frames� Black regions are those ex�cluded from the dominant object�d� The segmented tracked object after a few frames using temporal integration�
�rst detected object from the region of analysis� The scheme is repeated recursively� until no moreobjects can be detected�
In the example shown in Figure �� the second dominant object is detected and tracked� Thedetection and tracking of several moving objects can be performed in parallel� with a delay of oneor more frame between the computations for di�erent objects�
a� b� c� d�
Figure � Detecting and tracking the dominant object in an infrared imagesequence using temporal integration�a�b� Two frames in the sequence� Both the background and the car are moving�c� The internal representation image of the tracked object �the background��The background remains sharp with less noise� while the moving car blurs out�d� The segmented tracked object �the background� using an a�ne motionmodel� White regions are those excluded from the tracked region�
a� b� c�
Figure �� Detecting and tracking the second object using temporal integration�a� The initial region of analysis after excluding the �rst dominant object �fromFigure �d��b� The internal representation image of the second tracked object �the car��The car remains sharp while the background blurs out�c� Segmentation of the tracked car after � frames�
��� Segmentation
Once a motion has been determined� we would like to identify the region having this motion� Tosimplify the problem� the two images are registered using the detected motion� The motion ofthe corresponding region is canceled after registration� and the tracked region is stationary in theregistered images� The segmentation problem reduces therefore to identifying the stationary regionsin the registered images�
Pixels are classi�ed as moving or stationary using local analysis� The measure used for theclassi�cation is the average of the normal �ow magnitudes over a small neighborhood of each pixel�typically a � neighborhood�� In order to classify correctly large regions having uniform intensity�a multi�resolution scheme is used� as in low resolution levels the uniform regions are small� Thelower resolution classi�cation is projected on the higher resolution level� and is updated accordingto higher resolution information when it con�icts the classi�cation from the lower resolution level�
� Improvement of Spatial Resolution
Once good motion estimation and segmentation of a tracked object are obtained� it becomes possibleto enhance the images of this object�
Restoration of degraded images when a model of the degradation process is given is an ill�conditioned problem ��� �� ��� � � ��� ���� The resolution of an image is determined by the physicalcharacteristics of the sensor� the optics� the density of the detector elements� and their spatialresponse� Resolution improvement by modifying the sensor can be prohibitive� An increase in thesampling rate could� however� be achieved by obtaining more samples of the imaged object from asequence of images in which the object appears moving� In this section we present an algorithmfor processing image sequences to obtain improved resolution of di�erently moving objects� This isan extension of our earlier method� which was presented in �����
While earlier research on super�resolution ���� ��� ��� ��� treated only static scenes and puretranslational motion in the image plane� we treat dynamic scenes and more complex motions� Thesegmentation of the image plane into the di�erently moving objects and their tracking� using thealgorithm mentioned in Section �� enables processing of each object separately�
The Imaging Model� The imaging process� yielding the observed image sequence fgkg� is mod�eled by� gk�m�n� � �k�h�Tk�f�x� y��� � �k�x� y�� � where
� gk is the sensed image of the tracked object in the kth frame�
� f is a high resolution image of the tracked object in a desired reconstruction view� Findingf is the objective of the super�resolution algorithm�
� Tk is the ��D geometric transformation from f to gk� determined by the computed ��D motionparameters of the tracked object in the image plane �not including the decrease in samplingrate between f and gk�� Tk is assumed to be invertible�
� h is a blurring operator� determined by the Point Spread Function of the sensor �PSF�� Whenlacking knowledge of the sensor�s properties� it is assumed to be a Gaussian�
� �k is an additive noise term�
� �k is a downsampling operator which digitizes and decimates the image into pixels and quan�tizes the resulting pixels values�
The receptive �eld �in f� of a detector whose output is the pixel gk�m�n� is uniquely de�ned by itscenter �x� y� and its shape� The shape is determined by the region of support of the blurring operatorh� and by the inverse geometric transformation Tk
��� Similarly� the center �x� y� is obtained byTk����m�n���
An attempt is made to construct a higher resolution image �f � which approximates f as accu�rately as possible� and surpasses the visual quality of the observed images in fgkg� It is assumedthat the acceleration of the camera while imaging a single image frame is negligible�
�
Reconstructed Image
SimulatedImagingProcess
SimulatedLow-resolutionImages Images
Low-resolutionObserved
ProcessImaging
Original Image
Compare simulated andobserved low-resolutionimages.
Figure �� Schematic diagram of the super resolution algorithm� Areconstructed image is sought such that after simulating the imagingprocess� the simulated low�resolution images are closest to the ob�served low�resolution images� The simulation of the imaging processis expressed by Equation �
The Super�Resolution Algorithm� The presented algorithm for creating higher resolutionimages is iterative� Starting with an initial guess f ��� for the high resolution image� the imaging
process is simulated to obtain a set of low resolution images fg���k g
K
k�� corresponding to the observed
input images fgkgKk��� If f ��� were the correct high resolution image� then the simulated images
fg���k g
K
k�� should be identical to the observed images fgkgKk��� The di�erence images fgk � g
���k g
K
k��are used to improve the initial guess by �backprojecting� each value in the di�erence images ontoits receptive �eld in f ���� yielding an improved high resolution image f ���� This process is repeatediteratively to minimize the error function
e�n� �
vuut �
K
KXk��
kgk � g�n�k k
�
�
The algorithm is described schematically in Figure ��
�
The imaging process of gk at the nth iteration is simulated by�
g�n�k � �Tk�f
�n�� � h� � s � �
where � s denotes a downsampling operator by a factor s� and � is the convolution operator� Theiterative update scheme of the high resolution image is expressed by�
f �n��� � f �n� ��
K
KXk��
Tk�����gk � g
�n�k � � s� � p
����
where K is the number of low resolution images� � s is an upsampling operator by a factor s� andp is a �backprojection� kernel� determined by h and Tk as explained below� The average takingin Equation ��� reduces additive noise� The algorithm is numerically similar to common iterativemethods for solving sets of linear equations ����� and therefore has similar properties� such as rapidconvergence �see next paragraph��
In Figure � the resolution of a car�s license plate was improved from �� frames�
Analysis and Discussion� We introduce exact analysis of the superresolution algorithm in thecase of deblurring� Restoring an image from K blurred images �taken from di�erent viewing posi�tions of the object�� with ��D a�ne transformations fTkg
Kk�� between them and the reconstruction
viewing position� and without increasing the sampling rate� This is a special case of superresolution�which is simpler to analyze� In this case the imaging process is expressed by�
g�n�k � Tk�f
�n�� � h
and the restoration process in Equation ��� becomes�
f �n��� � f �n� ��
K
KXk��
Tk����gk � g
�n�k � � p
�� ���
The following theorems show that the iterative super resolution scheme is an e�ective deblurringoperator �proofs are given in the appendix��
Theorem ��� The iterations of Equation ��� converge to the desired deblurred image f �i�e�� anf that ful�lls� �k gk � Tk�f� � h�� if the following condition holds�
k� � h � pk� ��
�K
PKk�� kTkk�
��
where � denotes the unity pulse function centered at ��� ���Remark� When the �D image motions of the tracked object consist of only �D translations androtations� then Condition �� reduces to k� � h � pk� � � �
Proof� see appendix�
Theorem ��� Given Condition ��� the algorithm converges at an exponential rate �the norm ofthe error converges to zero faster than qn for some � � q � ��� regardless of the choice of initialguess f ��� �
��
a� b�
c�
Figure � Improvement of spatial resolution using �� frames� Thesampling rate was increased by � in both directions�a� The best frame from the image sequence�b� The license plate magni�ed by � using bilinear interpolation�c� The improved resolution image�
Proof� see appendix�
It is important to note that the original high resolution frequencies may not always be fullyrestored� For example� if the blurring function is an ideal low pass �lter� and its Fourier transformhas zero values at high frequencies� it is obvious that the frequency components which have been�ltered out cannot be restored� In such cases� there is more than one high resolution image whichgives the same low resolution images after the imaging process� According to Theorem �� thealgorithm converges regardless of the choice of the initial guess� However� since there may be morethan one correct solution to the problem� the choice of the initial guess does determine which of thesolutions is reached� A good choice of the initial guess is the average of the registered low resolutionimages of the tracked object in the desired reconstruction view� f ��� � �
K
PKk�� Tk
���gk�� Such aninitial guess leads the algorithm to a smooth solution� which is usually a desired feature�
Another issue is the choice of the backprojection kernel p� Unlike h� which represents propertiesof the sensor �the PSF�� there is some freedom in the choice of p� p is chosen so that Condition ��
��
holds� The smaller k� � h � pk� is� the faster the algorithm converges �see proof of Theorem ����Ideally� if k� � h � pk� � �� then the algorithm converges in a single iteration� This� however�means that p is the inverse kernel of h� which may not exist �as h is a low pass �lter�� or whichmay numerically be unstable to compute� Permitting k� � h � pk� � � �but still within the boundsof Condition ���� allows p to be other than the exact inverse of h� and therefore increases thenumerical stability� but slows down the speed of convergence� In other words� there is a tradeo�between the stability of the algorithm and its speed of convergence� determined by the choice of p�
The algorithm converges rapidly �usually within less than � iterations�� and can be implementedon parallel machines� The complexity of the algorithm is low� O�KN logN� operations per itera�tion� where N is the number of pixels in the high resolution image f � and K is the number of lowresolution images� Since the number of iterations is very small� this is also a good estimate of thecomplexity of the entire algorithm� The algorithm can be implemented in real�time� as only simplearithmetic operations are involved in the computation�
��
� Reconstruction of Occlusions
When parts of a tracked object are occluded in some frames by another moving object� but theseparts appear in other frames� a more complete view of the occluded object can be reconstructed���� ��� The image frames are registered using the computed motion parameters of the trackedobject� and the occluded parts of that object are then reconstructed by temporally averaging graylevels of all pixels which were classi�ed as object pixels in the corresponding segmentation masks�This process ignores the pixels when they are occluded by another moving object in the foreground�and the missing regions will be reconstructed even if they are occluded in most frames�
In the example shown in Figure � the background of the image sequence �the room scene�was completely reconstructed� eliminating the walking girl from the scene� The background wasreconstructed in all frames� generating a new sequence with no trace of the moving girl�
�
a� b� c�
Figure � Reconstruction of occluded regions�a� Five frames from the sequence� The camera is panning� and a girlwalks from right to left� The girl appears in all frames and occludesparts of the background in each frame in the sequence�b� Segmentation� black regions are those excluded from the trackedbackground�c� Full reconstructions of the background in all frames� The girl iseliminated�
��
� Reconstruction of Objects in Transparent Motion
A region contains transparent motions if it contains several di�erently moving image patterns thatappear superimposed� For example� moving shadows� spotlights� re�ections in water� an objectviewed through another transparent object� etc� In this section� we present a method for isolatingand reconstructing tracked objects in transparent motion�
The presented scheme assumes additive transparency �such as in re�ections�� However� thisscheme could be applied also to cases of multiplicative transparency �as in moving shadows andviewing through a semi�transparent media� by using the logarithm operation� Taking the logarithmof the input images changes the multiplicative e�ects into additive e�ects� and once the tracking isdone� the exponent is taken to return to the original scale�
Previous analysis of transparency �� �� ��� ��� � � assumed constant motion over several suc�cessive frames� which excludes most sequences taken from an unstabilized moving camera� Somemethods �� ��� � � elegantly avoid the segmentation problem� They require� however� high orderderivatives �the order increases with the number of objects�� which make them sensitive to noisydata�
In our work we do not assume any motion constancy� We temporally integrate the imageframes rather than use temporal derivatives� This provides robustness and numerical stability tothe tracking algorithm� This approach not only tracks the moving transparent objects� but alsoreconstructs them�
Transparent motions yield several motion components at each point� and segmentation cannotbe used to isolate one of the transparent objects� In practice� however� due to varying imagecontrast� in many image regions one object is more prominent than other objects� and segmentationcan be used to extract pixels which support better a single motion in the region of analysis� We usethe temporal integration scheme described in Section �� to track the dominant transparent object�The temporal averaging restores the dominant transparent object in its internal representationimage� while blurring out the other transparent objects� making them less noticeable� Comparingeach new frame to the internal representation image of the tracked object rather than to the previousframe gives the algorithm a strong bias to keep tracking the same transparent object� as it is theonly object in the internal image that is still similar to its image in the new frame �Figure ���
For recovering the second transparent object� the temporal integration tracking technique isapplied once more to the sequence� after some delay� Let Av��t� denote the internal representationimage of the �rst transparent object� Starting at frame I�t�� the algorithm is applied only to pixelsfor which the value of jI�t��Av��t�j is high� This di�erence image has high values in regionswhich contain prominent features of transparent objects in I�t� that faded out in the internalrepresentation image Av��t�� and low values in regions which correspond to the �rst dominanttransparent object� Therefore� we use the values of the absolute di�erence image as an initial maskfor the search of the next dominant object in the temporal integration algorithm from Section �� �The tracking algorithm is applied once again to the original image sequence� and not to framedi�erences as in ��� Now that the algorithm tracks the second dominant object� the new internalrepresentation image Av��t� restores the second dominant transparent object� and blurs out theother transparent objects� including the �rst dominant object�
��
a� b�
c� d�
Figure �� Reconstruction of �transparent� objects�a�b� The �rst and last frames in a sequence� A moving tripod is re�ected inthe glass of a picture of �owers�c� The internal representation image of the �rst tracked object �the pictureof �owers� after �� frames� The picture of �owers was reconstructed� There�ection of the tripod faded out�d� The internal representation image of the second tracked object �the re�ectionof the tripod� after �� frames� The re�ection of the tripod was reconstructed�The picture of �owers faded out�
In Figure �� the reconstruction of two transparent moving objects in a real image sequence isshown�
�
� Concluding Remarks
Temporal integration of registered images proves to be a powerful approach to motion analysis�enabling human�like tracking of moving objects� Once good motion estimation and segmentationof a tracked object are obtained� it becomes possible to enhance the object images� Fusing informa�tion on tracked objects from several registered frames enables reconstruction of occluded regions�improvement of image resolution� and reconstruction of transparent moving objects�
�
APPENDIX
The appendix contains proofs of Theorems �� and ��� The following notations will be used�
� T denotes the transformation from the deblurred image f to a blurred image g�
� �T denotes the respective ��D a�ne transformation describing the geometric transformationof pixel coordinates from f to g� i�e��
�T�f�� �x� y� � f��T���x� y�
��
Remarks�
�� �T is assumed to be invertible�
�� It is easy to show from this de�nition that T is a linear transformation of f �
� Since �T is a ��D a�ne transformation� it can be expressed in matrix notation by�
�T�x� y� � �d�M �
�x
y
��
where �d is a �� � vector� and M is a �� � matrix�
� �MT denotes the linear transformation part of �T �on pixel coordinates�� which uses the matrixM only �without the displacement �d�� i�e��
�MT�x� y� �M �
�x
y
��
and respectively� MT is the linear transformation on images de�ned by�
�MT�f�� �x� y� � f��M��T�x� y�
��
� det�M� denotes the determinant of the matrix M �
In order to prove Theorems �� and ��� we introduce the following two lemmas�
Lemma �
���a kTk� � j det�M�j�
�
���b kT��k� ��
jdet�M�j�
�
���c kMTk� � j det�M�j�
�
���d kMT��k� ��
j det�M�j�
�
Remark� It follows from this lemma that whenever the a�ne transformation �T is limited to �Dtranslations and rotations� then kTk� � �� When the transformation contains a scaling by a factors� then kTk� � s�
��
Proof�
���a
kT �f�k� �
�Z �
��
Z �
��j �T �f�� �x� y�j� dx dy
��
�
�
�Z �
��
Z �
��jf��T���x� y�
�j� dx dy
��
�
�
�Z �
��
Z �
��jf�u� v�j� � j det�M�j du dv
��
�
� change of integral variables by �u� v� � �T���x� y�
� j det�M�j�
� � kfk� �
Therefore�
kTk�def� Supkfk��� �kT �f�k��
� Supkfk���
�j det�M�j
�
� � kfk��
� j det�M�j�
� � Supkfk��� �kfk��
� j det�M�j�
� �
���b
It is easy to show that�
T�� �x� y� � �T���x� y� � �M�� � �d � M�� �
�x
y
�� Therefore�
according to Lemma ��a� kT��k� � j det�M���j�
� � �
j det�M�j�
�
�
���c
Since �MT �x� y� �M �
�x
y
�� then according to Lemma ��a kMTk� � j det�M�j
�
� �
���d
It is easy to show that MT�� � MT�� � Therefore� according to Lemmas ��b and ��c�
kMT��k� ��
jdet�M�j�
�
�
Lemma �
T �f� � f�� ��
kTk�� � �T �f�� �MT �f���
where � denotes the convolution operator�
��
Proof�
�T �f�� �MT �f��� �x� y� �Z �
��
Z �
���T �f��� ��� � � �MT �f��� �x� �� y � � d� d
�
Z �
��
Z �
��f�
��T����� �
�� f���M��T �x� �� y � �
�d� d
�Z �
��
Z �
��f�
��T����� �
�� f���M��T �x� y�� �M��
T ��� ��d� d
�
Z �
��
Z �
��f�
��T����� �
�� f���T���x� y�� �T����� �
�d� d
�Z �
��
Z �
��f��� �� � f�
��T���x� y�� �� ��
�� j det�M�j d d�
� change of integral variables by �� �� � �T����� �
� j det�M�j � �f� � f��� �T���x� y��
� j det�M�j � �T �f� � f��� �x� y�
� kTk�� � �T �f� � f��� �x� y� � using Lemma ��a
Theorem ���
Let Tk denote the transformation from the deblurred image f to the blurred image gk The iterationsof Equation ��� converge to the desired deblurred image f �i�e�� an f that ful�lls� �k gk �Tk�f� � h�� if the following condition holds�
k� � h � pk� ��
�K
PKk�� kTkk�
��
where � denotes the unity pulse function centered at ��� ���
Proof� Mathematical manipulations on the left hand side of Equation ��� yield�
f �n��� � f �n� ��
K
KXk��
Tk����gk � g
�n�k � � p
�
��
K
KXk��
�f �n� � Tk
����gk � g
�n�k � � p
��
��
K
KXk��
Tk���Tk�f
�n�� � �gk � g�n�k � � p
�
��
K
KXk��
Tk���Tk�f
�n�� � gk � p� g�n�k � p
�
��
K
KXk��
Tk���Tk�f
�n�� � gk � p� Tk�f�n�� � h � p
�
��
��
K
KXk��
Tk���Tk�f
�n�� � �� � h � p� � gk � p�
Therefore�
f �n��� ��
K
KXk��
Tk���Tk�f
�n�� � �� � h � p� � gk � p�
��
is another way of expressing the iterative scheme de�ned by Equation ����
It is easy to show that the desired f is a �xed point of Equation ��� by replacing f �n��� andf �n� with f � and gk with Tk�f� � h � Therefore�
f ��
K
KXk��
Tk�� �Tk�f� � �� � h � p� � gk � p� � ���
We shall now show that limn�� f �n� � f �
kErr�n���k� � kf �n��� � fk�
� k�
K
KXk��
Tk���Tk�f
�n�� � �� � h � p� � gk � p��
�
K
KXk��
Tk�� �Tk�f� � �� � h � p� � gk � p� k� � using � and ��
� k�
K
KXk��
Tk���Tk�f
�n� � f� � �� � h � p��k�
� k�
K
KXk��
�jdet�Mk�j � �f
�n� � f� �MTk���� � h � p�
�k�
� using Lemma � and Lemma ��b
� k�f �n� � f� ��
K
KXk��
�jdet�Mk�j �MTk
���� � h � p��k�
� kf �n� � fk� ��
K
KXk��
�jdet�Mk�j � kMTk
���� � h � p�k��
� kf �n� � fk� ��
K
KXk��
�jdet�Mk�j � kMTk
��k�k� � h � pk��
� kErr�n�k� � k� � h � pk� ��
K
KXk��
�jdet�Mk�j � kMTk
��k��
� kErr�n�k� � k� � h � pk� ��
K
KXk��
�jdet�Mk�j �
�
jdet�Mk�j�
�
�� using Lemma ��d
� kErr�n�k� � k� � h � pk� ��
K
KXk��
jdet�Mk�j�
�
��
��� � unfolding the recursion
� kErr���k� �
�k� � h � pk� �
�
K
KXk��
jdet�Mk�j�
�
�n��
According to Condition ��� and Lemma ��a
k� � h � pk� ��
K
KXk��
jdet�Mk�j�
� � � �
therefore
limn��
�k� � h � pk� �
�
K
KXk��
jdet�Mk�j�
�
�n��
� � � ���
and thereforelimn��
kErr�n�k� � � �
This proves that limn�� f �n� � f �
Remark� When the ��D image motions of the tracked object consist of only ��D translations androtations� then Condition �� reduces to k� � h � pk� � �� The reason for this is that j det�Mk�j � �for such a�ne transformations �Tk� and therefore� according to Lemma ��a � kTkk� � ��
Theorem ���Given Condition ��� the algorithm converges at an exponential rate �the norm of the error convergesto zero faster than qn for some � � q � ��� regardless of the choice of initial guess f ��� �
Proof� Equation ��� con�rms the exponential speed of convergence� The proof of Theorem ��shows that limn�� kErr�n�k� � � regardless of the magnitude kErr���k� � and therefore thechoice of the initial guess f ��� does not a�ect the convergence�
References
��� G� Adiv� Determining three�dimensional motion and structure from optical �ow generated byseveral moving objects� IEEE Trans� on Pattern Analysis and Machine Intelligence� ���� ������� July �����
��� H�C� Andrews and B�R� Hunt� editors� Digital Image Restoration� Prentice Hall� ���
� � J�R� Bergen and E�H� Adelson� Hierarchical� computationally e�cient motion estimation al�gorithm� J� Opt� Soc� Am� A�� �� �� ����
��� J�R� Bergen� P� Anandan� K�J� Hanna� and R� Hingorani� Hierarchical model�based motionestimation� In European Conference on Computer Vision� pages � ����� Santa MargaritaLigure� May �����
��
��� J�R� Bergen� P�J� Burt� K� Hanna� R� Hingorani� P� Jeanne� and S� Peleg� Dynamic multiple�motion computation� In Y�A� Feldman and A� Bruckstein� editors� Arti�cial Intelligence andComputer Vision� Proceedings of the Israeli Conference� pages ������ Elsevier� �����
�� J�R� Bergen� P�J� Burt� R� Hingorani� and S� Peleg� Computing two motions from three frames�In International Conference on Computer Vision� pages �� �� Osaka� Japan� December �����
�� P�J� Burt� R� Hingorani� and R�J� Kolczynski� Mechanisms for isolating component patternsin the sequential analysis of multiple motion� In IEEE Workshop on Visual Motion� pages����� � Princeton� New Jersey� October �����
��� T� Darrell and A� Pentland� Robust estimation of a multi�layered motion representation� InIEEE Workshop on Visual Motion� pages � ���� Princeton� New Jersey� October �����
��� R�C� Gonzalez� Image enhancement and restoration� In T�Y� Young and K�S� Fu� editors�Handbook of Pattern Recognition and Image Processing� pages ������ � Academic Press� ����
���� B�K�P� Horn and B�G� Schunck� Determining optical �ow� Arti�cial Intelligence� �������� ������
���� T�S� Huang� editor� Image Enhancement and Restoration� JAI Press� ����
���� T�S� Huang and R�Y� Tsai� Multi�frame image restoration and registration� In T�S� Huang�editor� Advances in Computer Vision and Image Processing� volume �� pages �� �� JAIPress Inc�� �����
�� � R�A� Hummel� B� Kimia� and S�W� Zucker� Deblurring gaussian blur� Computer Vision�Graphics� and Image Processing� ������ ����
���� M� Irani and S� Peleg� Improving resolution by image registration� CVGIP� Graphical Modelsand Image Processing� � �� ��� �� May �����
���� M� Irani and S� Peleg� Image sequence enhancement using multiple motions analysis� In IEEEConference on Computer Vision and Pattern Recognition� Champaign� June �����
��� M� Irani� B� Rousso� and S� Peleg� Detecting and tracking multiple moving objects usingtemporal integration� In European Conference on Computer Vision� pages ������� SantaMargarita Ligure� May �����
��� M� Irani� B� Rousso� and S� Peleg� Computing occluding and transparent motions� To appearin International Journal of Computer Vision� ��� �
���� S�P� Kim� N�K� Bose� and H�M� valenzuela� Recursive reconstruction of high resolution imagefrom noisy undersampled multiframes� IEEE Transactions on Acoustics� Speech� and SignalProcessing� ������� ����� June �����
���� R�L� Lagendijk and J� Biemond� Iterative Identi�cation and Restoration of Images� KluwerAcademic Publishers� Boston�Dordrecht�London� �����
�
���� B�D� Lucas and T� Kanade� An iterative image registration technique with an application tostereo vision� In Image Understanding Workshop� pages ����� �� �����
���� M� Shizawa� On visual ambiguities due to transparency in motion and stereo� In EuropeanConference on Computer Vision� pages �������� Santa Margarita Ligure� May �����
���� M� Shizawa and K� Mase� Simultaneous multiple optical �ow estimation� In InternationalConference on Pattern Recognition� pages ������ Atlantic City� New Jersey� June �����
�� � M� Shizawa and K� Mase� Principle of superposition� A common computational framework foranalysis of multiple motion� In IEEE Workshop on Visual Motion� pages ������ Princeton�New Jersey� October �����
���� H� Shvayster and S� Peleg� Inversion of picture operators� Pattern Recognition Letters� ������������
���� H� Ur and Gross D� Improved resolution from subpixel shifted pictures� CVGIP� GraphicalModels and Image Processing� ���������� �����
��� J� Wang and E� Adelson� Layered representation for motion analysis� In IEEE Conference onComputer Vision and Pattern Recognition� pages �� � New York� June ��� �
��