CSE/EE 486: Computer Vision I

Computer Project Report: Project 5

CAMSHIFT Tracking Algorithm

Group #4: Isaac Gerg, Adam Ickes, Jamie McCulloch

Date: December 7, 2003

A. Objectives

i. Become familiar with object and feature tracking.
ii. Study motion and its effects on optical flow.
iii. Study tracking techniques utilizing object hue.
iv. Implement the CAMSHIFT algorithm (PDF of article from Intel OpenCV (294k)).

v. Become familiar with Matlab programming and the Image Processing Toolbox.

B. Methods

There are two M-files for this project.

All coordinate references are defined as follows:

-X
 ^
 |
 |
 |
 +---------> Y

Where the + is the top left-hand corner of the image.

part1.m

1. Converts 14 image pairs of a video sequence to grayscale.
2. Computes the absolute difference between the pairs, yielding 15 images.
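The listing below is a minimal sketch of this step; the frame filenames, extension, and pair count are placeholders rather than the exact values used by part1.m.

% Sketch of the frame-differencing step; filenames and count are placeholders.
for k = 0:13
    f1 = rgb2gray(imread(sprintf('frame%02d.jpg', k)));      % current frame, grayscale
    f2 = rgb2gray(imread(sprintf('frame%02d.jpg', k + 1)));  % next frame, grayscale
    d  = imabsdiff(f1, f2);                                   % absolute difference image
    figure, imshow(d);                                        % display the difference
end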

camshift.m

1. Implementation of the CAMSHIFT algorithm for tracking a hand in a video sequence.

Executing this project from within Matlab

At the command prompt enter:

>>part1

>>camshift

C. Results

The video sequence analyzed in this experiment is located here.

Results are described below in the order of the Methods section above.


The first 14 pairs of frames in the video sequence were absolutely differenced to study the effect. It appears that for this video sequence, tracking using this method would not yield good results. The hand object is not "closed" in all of the differenced scenes. Furthermore, thresholding would have to be done to remove some of the background noise, which is caused by small movements of the person and by illumination and reflection differences.

In a tracking method such as this, one may want to utilize morphology techniques in an attempt to bound the object. Once an object is bounded, its centroid can be computed, which enables us to track the object. In busy scenes with much variation in movement and lighting, this method becomes nearly useless.

This method is really only useful for transient optical flows. If the pixel intensities do not change, no motion will be visible in the difference frames.

Figure 1 - Absolute difference of 14 pairs of frames. Frames 0-15 utilized. The frames are in sequence from left to right across the rows.

CAMSHIFT Algorithm

The CAMSHIFT algorithm is based on the MEAN SHIFT algorithm. MEAN SHIFT works well on static probability distributions but not on dynamic ones such as those in a movie. CAMSHIFT is built on the principles of MEAN SHIFT but adds a facet to account for these dynamically changing distributions.

CAMSHIFT is able to handle dynamic distributions by readjusting the search window size for the next frame based on the zeroth moment of the current frame's distribution. This allows the algorithm to anticipate object movement and quickly track the object in the next scene. Even during quick movements of an object, CAMSHIFT is still able to track it correctly.

As a variation of MEAN SHIFT, CAMSHIFT works by tracking the hue of an object, in this case flesh color. The movie frames were all converted to HSV space before individual analysis.

CAMSHIFT was implemented as follows (a condensed code sketch is given after this list):
1. The initial location of the 2D search window was computed.
2. The color probability distribution is calculated for a region slightly bigger than the mean shift search window.
3. Mean shift is performed on the area until suitable convergence. The zeroth moment and centroid coordinates are computed and stored.
4. The search window for the next frame is centered around the centroid, and its size is scaled by a function of the zeroth moment.
5. Go to step 2.
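The loop below is only a condensed sketch of these five steps, not the actual camshift.m source. The names huebackproject and meanshift are hypothetical helpers standing in for the corresponding pieces of code, frames is assumed to be a cell array of RGB movie frames, and the initial window values are placeholders.

% Condensed CAMSHIFT loop; huebackproject and meanshift are hypothetical helpers.
win = [100 60 50 40];                           % initial [x y w h] window (placeholder
                                                % values; the real one was found by inspection)
for k = 1:numel(frames)
    hsv  = rgb2hsv(frames{k});                  % convert the frame to HSV space
    prob = huebackproject(hsv(:,:,1));          % step 2: flesh-probability distribution
    [win, M00, xc, yc] = meanshift(prob, win);  % step 3: mean shift until convergence
    s   = 1.1 * sqrt(M00);                      % step 4: scale from the zeroth moment
    win = [xc - s/2, yc - 1.2*s/2, s, 1.2*s];   % recenter and resize for the next frame
end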

The initial search window was determined by inspection; Adobe Photoshop was used to determine its location and size. The initial window size was just big enough to fit most of the hand inside of it. A window that is too big may fool the tracker into tracking another flesh-colored object. A window that is too small will most likely expand quickly to an object of constant hue; however, for quick motion, the tracker may lock onto another object or the background. For this reason, a hue threshold should be utilized to help ensure the object is properly tracked, and in the event that an object whose mean hue is not the correct color is being tracked, some operation can be performed to correct the error.
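One possible form of that check is sketched below; the 0.8 threshold and the fallback to the initial window are illustrative choices and not necessarily what the report used. Here hue is the hue image of the current frame, win is the [x y w h] search window, and initwin is the initial window.

% Sketch of a hue sanity check on the tracked region; threshold and recovery
% action are illustrative, and hue, win, initwin are assumed inputs.
r = round(win);                                      % window corners as integers
huewin = hue(r(1):r(1)+r(3)-1, r(2):r(2)+r(4)-1);    % hue values inside the window
if mean(huewin(:)) < 0.8                             % mean hue no longer flesh-like
    win = initwin;                                   % correct the error, e.g. reset the window
end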


For each frame, its hue information was extracted. We noted that the hue of human flesh has a high angle value. This simplified our tracking algorithm, as the probability that a pixel belonged to the hand decreased as its hue angle did. Hue thresholding was also performed to help filter out the background and make the flesh color more prominent in the distributions.
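A sketch of this hue extraction and thresholding is shown below; the 0.8 threshold and the filename are illustrative placeholders rather than the actual values used.

% Sketch of hue extraction and thresholding for one frame.
frame = imread('frame00.jpg');    % placeholder filename for one RGB movie frame
hsv   = rgb2hsv(frame);           % convert to HSV space
hue   = hsv(:,:,1);               % hue channel, scaled by rgb2hsv to [0,1]
prob  = hue;                      % flesh probability grows with hue angle
prob(hue < 0.8) = 0;              % illustrative hue threshold to suppress the background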

The zeroth moment, the moment for x, and the moment for y were all calculated. The centroid was then calculated from these values:

xc = M10 / M00; yc = M01 / M00
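Written out for the probability image P restricted to the search window, these moments are just weighted sums, as in the sketch below (rows play the role of x and columns the role of y under the coordinate convention defined in the Methods section).

% Sketch of the moment and centroid computation; P is the flesh-probability
% image cropped to the current search window (x = rows, y = columns).
[nr, nc] = size(P);
[y, x] = meshgrid(1:nc, 1:nr);   % x varies down the rows, y across the columns
M00 = sum(P(:));                 % zeroth moment
M10 = sum(sum(x .* P));          % first moment in x
M01 = sum(sum(y .* P));          % first moment in y
xc  = M10 / M00;                 % centroid x (row) within the window
yc  = M01 / M00;                 % centroid y (column) within the window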

The search window was then shifted to center on the centroid, and the mean shift was computed again. The convergence threshold used was T = 1; this ensured that we got a good track on each of the frames. A 5 pixel expansion in each direction of the search window was done to help track movement.
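One way that iteration can look is sketched below; windowmoments is a hypothetical helper returning the zeroth moment and centroid of the probability image restricted to the window (for example, via the moment sketch above), and win is assumed to be [x y w h].

% Sketch of the mean shift iteration with T = 1 and the 5-pixel expansion;
% windowmoments is a hypothetical helper.
T   = 1;                                       % convergence threshold in pixels
win = win + [-5 -5 10 10];                     % expand the window 5 pixels per side
shift = Inf;
while shift > T
    [M00, xc, yc] = windowmoments(prob, win);  % moments of prob inside win
    oldc  = [win(1) + win(3)/2, win(2) + win(4)/2];   % current window center
    shift = max(abs([xc yc] - oldc));          % how far the centroid moved
    win(1:2) = [xc yc] - win(3:4)/2;           % recenter the window on the centroid
end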

Once the convergent values for the mean and centroid were computed, we computed the new window size. The window size was based on the area of the probability distribution. The scaling factor used was calculated by:

s = 1.1 * sqrt(M00)

The 1.1 factor was chosen after experimentation. A desirable factor is one that does not blow up the window size too quickly, or shrink it too quickly. Since the distribution is 2D, we use the sqrt of M00 to get the proper length in a 1D direction.

The new window size was computed with this scaling factor. It was noted that the width of the hand object was 1.2 times greater than the height, so the new window size was computed as:

W = [ (s) (1.2*s) ]

The window is centered around the centroid and the computation of the next frame is started.


Figure 2 - Probability distribution of skin. High intensity values represent high probability of skin. The search window and centroid are also superimposed on each frame. The frames are in sequence from top to bottom in each row. The frames displayed are 0, 19, 39, 59, 79.


Figure 3 - Actual frames from the movie. The search window and centroid are also superimposed on each frame. The frames are in sequence from top to bottom in each row. The frames displayed are 0, 19, 39, 59, 79.

Figure 4 - Motion plots of the centroid. The image on the right has the first frame of the movie superimposed on the plot.

Frame: Coordinates (x, y)
 1: 111, 181    2: 109, 176    3: 106, 173    4: 104, 172    5: 110, 168
 6: 112, 168    7: 115, 166    8: 118, 165    9: 118, 166   10: 123, 164
11: 126, 167   12: 130, 167   13: 131, 168   14: 131, 170   15: 134, 175
16: 137, 176   17: 137, 176   18: 140, 181   19: 144, 176   20: 145, 175
21: 146, 177   22: 146, 179   23: 145, 179   24: 145, 180   25: 145, 182
26: 145, 184   27: 149, 185   28: 151, 189   29: 151, 191   30: 150, 194
31: 148, 197   32: 149, 195   33: 149, 200   34: 143, 205   35: 142, 209
36: 144, 211   37: 145, 212   38: 144, 216   39: 141, 217   40: 142, 222
41: 142, 223   42: 140, 224   43: 139, 230   44: 140, 229   45: 139, 232
46: 138, 233   47: 137, 237   48: 136, 236   49: 134, 239   50: 132, 241
51: 132, 243   52: 129, 245   53: 130, 246   54: 130, 248   55: 127, 250
56: 123, 249   57: 124, 249   58: 124, 249   59: 123, 252   60: 120, 255
61: 120, 253   62: 119, 253   63: 124, 257   64: 120, 257   65: 112, 253
66: 111, 252   67: 110, 254   68: 109, 254   69: 106, 252   70: 109, 251
71: 108, 247   72: 106, 245   73: 104, 243   74: 108, 242   75: 109, 239
76: 108, 237   77: 109, 237   78: 109, 235   79: 110, 223   80: 109, 223
81: 111, 222   82: 111, 221   83: 112, 217   84: 113, 215   85: 113, 213
86: 113, 210   87: 115, 209   88: 116, 207   89: 116, 205   90: 114, 204
91: 115, 201   92: 119, 198   93: 117, 196   94: 120, 195   95: 121, 192
96: 120, 193   97: 123, 189   98: 123, 189   99: 124, 185

Figure 5 - Centroid coordinates of each frame.

Summary

All results were as expected in the experiment.

D. Conclusions

Object tracking is a very useful tool. Objects can be tracked in many ways, including by color or by other features.

Tracking objects by difference frames is not always robust enough to work in every situation. There must be a static background and constant illumination to get great results. With this method, objects can be tracked only in situations with transient optical flow. If the pixel values don't change, no motion will be detected.

CAMSHIFT is a more robust way to track an object based on its color or hue. It is based on the MEAN SHIFT algorithm and improves upon it by accounting for dynamic probability distributions: it scales the search window size for the next frame by a function of the zeroth moment. In this way, CAMSHIFT is very robust for tracking objects.

There are many variables in CAMSHIFT. One must decide on suitable thresholds and search window scaling factors. One must also take into account uncertainties in hue when there is little intensity to a color. Knowing your distributions well helps one to pick scaling values that track the correct object.

In any case, CAMSHIFT works well in tracking flesh-colored objects. These objects can be occluded or move quickly, and CAMSHIFT usually corrects itself.

E. Appendix

Source Code

part1.m
camshift.m

Movies

Hand tracking RGB with centroid and search window.
Hand tracking probability (hue) with centroid and search window.
Original movie

The hand tracking movies have the following format parameters:

Fps: 15.0000
Compression: 'Indeo3'
Quality: 75
KeyFramePerSec: 2.1429

Automatically updated parameters:
TotalFrames: 99
Width: 320
Height: 240
Length: 0
ImageType: 'Truecolor'
CurrentState: 'Closed'
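These are the properties reported by Matlab's avifile interface (since removed from Matlab). Writing one of the tracking movies would have looked roughly like the sketch below, with the output filename and the per-frame drawing code left as placeholders.

% Sketch of writing a tracking movie with the old avifile interface;
% the filename is a placeholder and the frame drawing is elided.
mov = avifile('handtrack.avi', 'fps', 15, 'compression', 'Indeo3', 'quality', 75);
for k = 1:99
    % ... display frame k with the search window and centroid superimposed ...
    mov = addframe(mov, getframe(gcf));   % grab the current figure as a movie frame
end
mov = close(mov);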

Time Management

Isaac spent ten hours working on this project. Adam spent two hours working on this project. Jamie spent two hours working on this project.