Project Report on
SCALABILITY IN HETEROGENEOUS ENVIRONMENT
(VIDEO COMPRESSION)
By
DHYANESH DAMANIA
SUDARSHAN GOPINATH
RAJESH TURE
NILESH KARIA
Guided by
Dr. N.S.T. SAI
(External Guide)
Mr. VIKRANT AGASKAR
(Internal guide)
Department of Computer Engineering
Vidyavardhini’s College of Engineering and Technology
K.T. Marg, Vasai Road
University of Mumbai
2003 – 2004
TABLE OF CONTENTS
ACKNOWLEDGEMENTS
ABSTRACT
1. INTRODUCTION
1.1 INTRODUCTION TO THE PROJECT
1.2 SOFTWARE REQUIREMENTS
1.3 HARDWARE REQUIREMENTS
1.4 AIM OF THE PROJECT
2. BRIEF DESCRIPTION OF THE PROJECT
2.1 ORIGINAL SCOPE OF THE PROJECT
2.2 EXPANDED SCOPE OF THE PROJECT
2.3 FUNDAMENTALS OF MOTION ESTIMATION AND COMPENSATION
2.4 FULL SEARCH ALGORITHM
2.5 LOGARITHMIC SEARCH ALGORITHM
3. DESIGN DETAILS
3.1 STRUCTURE OF THE ENCODER AND DECODER
3.2 COMPRESSED FILE HEADER FORMAT (.mcp FILE)
3.3 PROCESS FLOWCHARTS WITH DESCRIPTION
3.4 CLASS DIAGRAMS
3.5 INTERFACE DESIGN
4. TESTING RESULTS AND CONCLUSIONS
4.1 TEST PLATFORM
4.2 TEST CRITERIA AND VARIABLES
4.3 SELECTION OF TEST CASES
4.4 TEST RESULTS
4.5 SAMPLE FRAMES
4.6 CONCLUSIONS
5. APPLICATIONS AND FUTURE SCOPE
5.1 APPLICATIONS
5.2 FUTURE SCOPE
APPENDIX-A (THE AVI FILE HEADER FORMAT)
APPENDIX-B (THE BMP FILE HEADER FORMAT)
REFERENCES
ACKNOWLEDGEMENTS
No successful endeavor is the work of a single hand; it is the contribution of a number
of heads that makes it complete. We take this opportunity to express our gratitude to our
professors, who have not only given us support and guidance, but also pushed us to achieve
our best.
We express our gratitude to our external guide Dr. N.S.T. Sai, Senior Manager – Mahindra
British Telecom Ltd., for giving us this opportunity to work under his expert guidance.
Our sincere thanks to Professor (Mrs.) Madhavi Pradhan, Head – Department of Computer
Engineering for consistently backing us in our work and guiding us through the course of the
project.
This project would never have left the launch pad had it not been for Mr. Vikrant
Agaskar, our internal project guide, who has spent a great deal of his invaluable time and
effort structuring our project, time and again.
We would also like to thank the entire Computer Engineering Department for allowing us
to spend endless hours in the lab and for bending the rules on a few occasions.
Last but not least, we would like to thank our parents for supporting us and putting up
with our late night project sessions.
Dhyanesh Damania
Sudarshan Gopinath
Rajesh Ture
Nilesh Karia
ABSTRACT
A video clip is essentially a series of still images arranged back to back. Therefore problems
of storage size faced by still images get multiplied when applied to video clips. This makes it
necessary to use some kind of compression technique to bring the storage requirements of a
video file to more acceptable levels.
One of the best methods for compressing video files is Motion Estimation and
Compensation. This method is based on the fact that very little motion occurs between two
consecutive frames of a video. Hence, in most cases it is sufficient to transmit only the
differences between the frames and a few reference frames. However, Motion Estimation
works best for scenes with a reasonable amount of motion. In general, video with heavy
motion such as sports video is hard to compress with Motion Estimation and Compensation.
This project “Scalability in Heterogeneous Environment (Video Compression)” involves the
use of two such Motion Estimation algorithms in the implementation of an AVI file
compression utility, namely the Full Search algorithm and the Logarithmic Search algorithm.
While the best algorithm with respect to accuracy is the Full Search method, its extremely
high complexity necessitates the design of faster Motion Estimation algorithms. The
Logarithmic Search method is a scaled down version of the Full Search algorithm and is less
computationally expensive. However, the increase in performance is obtained at the cost of
accuracy in the estimation process.
Performance evaluation carried out on both algorithms shows that the decreased accuracy
of the Logarithmic Search method, if kept within limits, does not have a significant impact on
the quality of the reconstructed video file. Hence, for practical purposes, the
Logarithmic Search is preferred over the Full Search algorithm.
The AVI file compression utility implemented in this project allows the use of either of the
two methods to achieve significant compression. The utility also has a built-in feature that
allows the AVI file to be split into its component Bitmap format frames.
SCALABILITY IN HETEROGENEOUS ENVIRONMENT(VIDEO COMPRESSION)
1. INTRODUCTION
1.1 INTRODUCTION TO THE PROJECT
We live in a world of colour. The recent surge of high bandwidth communication has made
the possibility of supplementing audio transmission with video a reality. A plethora of
applications and services ranging from online video conferencing to streaming video on-
demand are possible.
The human eye has the property that when an image appears on the retina, the image is
retained for a few milliseconds before decaying. If a sequence of images is presented to the
human eye at 50 images per second, the eye does not notice that it is looking at discrete
images. All video systems exploit this principle to produce moving pictures. Hence, a video
is essentially a representation of a two-dimensional image as a function of time. The simplest
representation of a digital video is a sequence of frames, each consisting of a rectangular grid
of picture elements, or pixels.
Eight bits per pixel are commonly used to represent 256 gray levels. This scheme gives high
quality black and white video. For colour video, we use 8 bits for each of the RGB colours.
Using 24 bits per pixel we can obtain a colour range of about 16 million. To produce smooth
motion, digital video must display at least 25 frames per second. Smoothness of motion is
determined by the number of different images per second. These parameters become
significant when transmitting digital video over a network. Common
resolution configurations are 1024 × 768, 1280 × 960 and 1600 × 1200. Even the smallest of
these, with 24 bits per pixel and 25 frames per second, needs to be transmitted at about 472
Mbps.
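The 472 Mbps figure is simply the product of frame width, frame height, bits per pixel and frame rate. A minimal sketch of that arithmetic is given below; the function name rawBitRate is our own choice for illustration and is not part of the utility.

```cpp
#include <cassert>
#include <cstdint>

// Raw (uncompressed) video bit rate in bits per second.
// Hypothetical helper, shown only to illustrate the arithmetic in the text.
std::uint64_t rawBitRate(std::uint64_t width, std::uint64_t height,
                         std::uint64_t bitsPerPixel,
                         std::uint64_t framesPerSecond) {
    assert(width > 0 && height > 0);
    return width * height * bitsPerPixel * framesPerSecond;
}
```

For a 1024 × 768 frame at 24 bits per pixel and 25 frames per second this gives 471,859,200 bits per second, which rounds to the 472 Mbps quoted above.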
This means that transmission of uncompressed video is out of the question.
We therefore need a pair of algorithms to bring the files down to manageable sizes: one for
compressing the data at the source and another for decompressing it at the destination,
namely encoding and decoding respectively. The encode/decode process need not be
invertible. That is, when a file is compressed, transmitted and then decompressed, it is
usually acceptable for the decoded video signal to be slightly
different from the original.
The encoding process makes use of an inherent property of the adjacent frames themselves:
redundancy. There is very little difference between the positions of the object under
consideration between consecutive frames. Video compression can be achieved using motion
estimation processes, still image compression or a combination of both these techniques.
Fig: Methods of Video Compression
While still image compression (intraframe compression) applied to the individual frames of a
video clip produces drastic compression, we need further compression when considering video
transmission. Motion Estimation and Compensation algorithms (interframe
compression) enhance these compression ratios even further.
This project focuses on the implementation of the Motion Estimation and Compensation
algorithms for achieving Video file compression.
1.2 SOFTWARE REQUIREMENTS
GNU COMPILER COLLECTION (GCC):
GNU Compiler Collection contains compilers for C, C++, Objective-C, FORTRAN,
Java, and Ada, as well as libraries for these languages. It compiles and links code of
these languages to produce executable files.
Our project uses the C++ compiler and the libraries supported by GCC.
STANDARD TEMPLATE LIBRARY (STL):
The Standard Template Library, or STL, is a C++ library of container classes,
algorithms, and iterators; it provides many of the basic algorithms and data structures
of computer science. The STL is a generic library, meaning that its components are
heavily parameterized: almost every component in the STL is a template.
GNU IMAGE MANIPULATION PROGRAM (GIMP):
GIMP is used for viewing the intermediate results of the Motion Estimation and
Compensation process. Any other picture viewing software capable of displaying
Device Independent Bitmaps and Run Length Encoded bitmaps can also be used.
THE GIMP TOOLKIT (GTK+):
GTK+ is a multi-platform toolkit for creating graphical user interfaces. GTK is
essentially an object oriented application programmer interface (API). GTK+ is free
software and part of the GNU Project. GTK+ is used to create a graphical user
interface for our project. We have used GTK+2.0 for our project.
GLADE:
Glade is a GUI designer for the Linux Platform based on the GTK+2.0 Library. Glade
provides drag and drop tools for design of the interface.
1.3 HARDWARE REQUIREMENTS
PROCESSOR:
Minimum: Pentium 200MHz
Recommended: Pentium III 400 MHz or better
HARD DISK SPACE:
Minimum: 1.7 GB
Recommended: 2.1 GB
MEMORY:
Minimum: 64 MB RAM
Recommended: 128 MB RAM
1.4 AIM OF THE PROJECT
The primary focus of this project is the design and implementation of an AVI video file
compression and decompression utility using C++ and based on the Linux platform. The
utility will make use of the principle of Motion Estimation and Compensation. The utility
will provide the user with a detailed analysis of each compression operation.
The compression will be performed by using either of the following Motion Estimation and
Compensation algorithms based on the user’s choice:
Full Search Method
3-Step Logarithmic Search Method
Performance evaluation of the two algorithms will also be performed with respect to
Compression Ratio, Computation Time and Perceptual Quality of the regenerated video file.
The performance evaluation process aims to bring out the strengths and weaknesses of the
two algorithms when applied to video files exhibiting objects with different degrees of
motion.
AVI files with the following attributes will be considered as test cases:
Motion occurring only in the foreground with the background relatively unchanged.
(Example: Video of a television news reader)
Foreground relatively constant with a moving background.
(Example: Video of a bouncing ball)
A high motion video in which both the foreground and background are in motion.
(Example: any sports video like a motorbike race)
2. BRIEF EXPLANATION OF THE PROJECT
2.1 ORIGINAL SCOPE OF THE PROJECT
The project involves the design and implementation of an AVI video file compression and
decompression utility. The compression algorithms used in the utility are based on the
principle of Motion Estimation and Compensation only.
No still image compression algorithms are applied to the frames of the video sequence. Thus,
the utility exclusively uses temporal redundancy removal techniques to achieve compression.
No attempt is made to take into account the spatial redundancy present in the video sequence.
Also, the utility does not take into account audio interleaved AVI files but focuses on pure
video data AVI files.
Encoder –
The input to the encoder will be the sequence of frames that constitute a video file.
The output of the encoder will be a compressed format (.mcp) file.
Decoder –
The input to the decoder will be the compressed .mcp format file.
The output of the decoder will be the sequence of regenerated frames of the video
file.
The user will be given a choice of using the following Motion Estimation Algorithms to
perform compression:
Full Search Algorithm
Logarithmic Search
At the end of the compression process statistical data regarding the compression ratio,
computation time etc. will be provided to the user.
The primary concern of the project is the comparison of the two algorithms based on various
performance criteria like compression ratio, computation time and perceptual quality.
Appropriate test files will be used to evaluate the algorithms.
2.2 EXPANDED SCOPE OF THE PROJECT
AVI FILE SUPPORT –
The project has been extended to support entire AVI files instead of individual video frames.
The following AVI file formats are accepted as input:
Raw AVI files (no inherent compression)
RLE encoded AVI files
Support for AVI files of the above types which use 8-bit and 24-bit colour representation has
also been designed into the utility.
Encoder –
The encoder now accepts as input AVI files, with the process of splitting the
file into frames taken up by the AVI splitter module.
Output of the encoder will be the compressed .mcp file.
Decoder –
The decoder accepts as input a .mcp file.
The output of the decoder will be a regenerated AVI file instead of individual
video frames.
AVI FILE SPLITTER –
An additional feature for splitting the source AVI file into its component frames has been
included into the utility. This module extracts the component frames from the AVI file and
stores them in the BMP format.
FRAME BROWSER –
Using the built-in Frame Browser, the user can view the
frames at different stages of the compression / decompression process. The user will be able
to view the intermediary frames, which include the Motion Estimated and Motion Compensated
images at the decoder end. This eliminates the need for third-party image viewing
software.
2.3 FUNDAMENTALS OF MOTION ESTIMATION AND COMPENSATION
Motion Estimation and Compensation are the fundamental methods used to compress video
and are used in various codecs such as MPEG-1, 2, and 4, H.261, H.263, H.263+ etc. The
goal of Motion Estimation and Compensation is to exploit the temporal redundancy (i.e.
redundancy present between consecutive frames) within an image sequence for optimum
compression.
The process of computing changes among frames by establishing correspondence between
frames is referred to as temporal prediction with motion compensation. Motion compensation
is preceded by Motion Estimation, the process of finding the corresponding pixels between
the frames.
MOTION ESTIMATION
A way to exploit the motion of an object to achieve image compression is to partition the
image into blocks called Macroblocks. Given a reference picture and an N × M Macroblock
in a current picture, the objective of motion estimation is to determine the N × M block in the
reference picture that best matches the characteristics of the Macroblock in the current
picture. The matching is done according to a predefined criterion, the Mean
Absolute Error.
In many cases this reference block will be at the same position (no motion). In other cases it
will be at a different position (motion). To simplify the process, only a
translatory motion model is assumed for objects in the scene, and thus a rectangular geometry
is sufficient.
In general, the coordinates (x, y) of the Macroblock are given by its top-left corner.
Considering practical limitations, we restrict the search for a match to a [-p, p] search
region around the original location of our Macroblock in the current picture. This process of
finding a suitable match is illustrated in the following sequence of diagrams.
Fig 1. Motion estimation process: (a) Current Picture, (b) Reference Picture, (c) Reference
picture with Motion Vector (u, v) calculation, (d) Search region definition
Let (x + u, y + v) be the location of the best matching block in the reference picture. Our goal
is to determine the effective displacement of the matching block from its position in the
previous frame. This process is termed as calculation of a motion vector (MV).
Compression is achieved by sending or storing only the motion vector (and a possible small
error) instead of the pixel values for the entire block.
MOTION COMPENSATION
Motion Compensation follows Motion Estimation and is defined as the correction necessary
to compensate for errors introduced due to the Estimation Process. This is necessary as
Estimation is basically a process of approximation.
If a temporal redundancy reduction processor employs motion compensation, then we can
express its output, the difference error as:
e (x, y, t) = I(x, y, t) – I (x – u, y – v, t - 1)
where –
I(x, y, t) are pixel values at spatial location (x, y) in frame (t) and I(x – u, y – v, t – 1) are
corresponding pixel values at spatial location (x – u, y – v) in frame (t-1). The output of the
motion estimator, the co-ordinates (u, v), defines the relative motion of a block from one
frame to another and is referred to as the “Motion Vector (MV)” for block at (x, y).
I(x – u, y – v, t – 1) is referred to as the motion compensated prediction of I(x, y, t) and e(x,
y, t) is the prediction residual for I(x, y, t).
The Motion Vectors along with the compensated values are used for reconstructing the
frames at the receiver end.
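The relationship between the residual at the encoder and the reconstruction at the decoder can be sketched as follows. This is a minimal illustration that follows the sign convention of the equation above, not the utility's actual code: the Frame structure, the function names and the 8-bit greyscale layout are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical 8-bit greyscale frame stored row by row (an assumption).
struct Frame {
    int width;
    int height;
    std::vector<std::uint8_t> pixels;
    int at(int x, int y) const {
        return pixels[static_cast<std::size_t>(y) * static_cast<std::size_t>(width) +
                      static_cast<std::size_t>(x)];
    }
};

// Encoder side: e(x, y, t) = I(x, y, t) - I(x - u, y - v, t - 1).
int residual(const Frame& current, const Frame& previous,
             int x, int y, int u, int v) {
    assert(x - u >= 0 && x - u < previous.width);
    assert(y - v >= 0 && y - v < previous.height);
    return current.at(x, y) - previous.at(x - u, y - v);
}

// Decoder side: the motion compensated prediction plus the residual
// gives back the original pixel value.
int reconstruct(const Frame& previous, int x, int y, int u, int v, int e) {
    assert(x - u >= 0 && x - u < previous.width);
    assert(y - v >= 0 && y - v < previous.height);
    return previous.at(x - u, y - v) + e;
}
```

When the residual is stored without loss, reconstruct returns exactly the pixel that residual was computed from, which is why the decoder only needs the previous frame, the Motion Vector and the difference values.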
THE MATCHING CRITERION –
Let the pixels of the Macroblock in the current frame be denoted as C (x + k, y + l) and the
pixels in the reference picture be denoted as R (x + i + k, y + j + l). We define a cost
function:
$$\mathrm{MAE}(i, j) = \frac{1}{MN} \sum_{k=0}^{M-1} \sum_{l=0}^{N-1} \left| C(x + k, y + l) - R(x + i + k, y + j + l) \right|$$
where i and j are defined in -p ≤ i ≤ p and -p ≤ j ≤ p.
This is referred to as the Mean Absolute Error (MAE) or Mean Absolute Difference (MAD)
criterion.
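A minimal sketch of the MAE cost function is given below. It assumes 8-bit greyscale frames stored row by row, with k running over the M columns and l over the N rows of the Macroblock; the Frame structure and the function name are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical 8-bit greyscale frame stored row by row (an assumption).
struct Frame {
    int width;
    int height;
    std::vector<std::uint8_t> pixels;
    int at(int x, int y) const {
        return pixels[static_cast<std::size_t>(y) * static_cast<std::size_t>(width) +
                      static_cast<std::size_t>(x)];
    }
};

// MAE(i, j) between the M x N Macroblock at (x, y) in the current frame C
// and the block displaced by (i, j) in the reference frame R.
double meanAbsoluteError(const Frame& C, const Frame& R,
                         int x, int y, int i, int j, int M, int N) {
    assert(M > 0 && N > 0);
    assert(x >= 0 && y >= 0 && x + M <= C.width && y + N <= C.height);
    assert(x + i >= 0 && y + j >= 0);
    assert(x + i + M <= R.width && y + j + N <= R.height);
    long sum = 0;
    for (int l = 0; l < N; ++l) {
        for (int k = 0; k < M; ++k) {
            sum += std::abs(C.at(x + k, y + l) - R.at(x + i + k, y + j + l));
        }
    }
    return static_cast<double>(sum) / (M * N);
}
```

A value of zero means the displaced reference block is identical to the Macroblock; larger values indicate a poorer match.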
2.4 THE FULL SEARCH ALGORITHM
The Full Search algorithm is the most fundamental and most obvious method to perform the
process of Motion Estimation.
It is based on the principle of comparing each and every possible location within the
predefined search area for a possible match. In this algorithm to find the motion vector for
each Macroblock we have to compute MAE (i, j) at each location in the search space.
As we compute the MAE value at every location in the search area, the Estimation process
will be highly accurate. As a result the outcome of the Motion Compensation process will
consist of very few difference element values.
The main advantage of the full search method is that it guarantees finding the minimum
MAE value (it is an optimal Estimation algorithm). Hence, this algorithm is highly accurate
in finding the best possible match.
However, to compare each and every pixel for each Macroblock in the frame requires a
substantial amount of computing resources (i.e. the algorithm is computationally expensive).
Hence, in time critical applications such as sports broadcasting this method proves
insufficient in providing compression at high computation speeds.
Therefore, alternative methods have been developed to achieve sub-optimum performance at
significantly reduced complexity compared to full search methods.
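The exhaustive search described above can be sketched for a single Macroblock as follows. This is an illustrative sketch under stated assumptions rather than the utility's implementation: the Frame and MotionVector structures and the function names are hypothetical, frames are taken to be 8-bit greyscale, and candidate blocks that fall outside the reference frame are simply skipped.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <limits>
#include <vector>

// Hypothetical 8-bit greyscale frame stored row by row (an assumption).
struct Frame {
    int width;
    int height;
    std::vector<std::uint8_t> pixels;
    int at(int x, int y) const {
        return pixels[static_cast<std::size_t>(y) * static_cast<std::size_t>(width) +
                      static_cast<std::size_t>(x)];
    }
};

struct MotionVector {
    int u;
    int v;
    double mae;  // cost of the best match found
};

// MAE between the M x N block at (x, y) in cur and the block displaced by
// (i, j) in ref. The caller guarantees both blocks lie inside their frames.
double blockMae(const Frame& cur, const Frame& ref,
                int x, int y, int i, int j, int M, int N) {
    long sum = 0;
    for (int l = 0; l < N; ++l)
        for (int k = 0; k < M; ++k)
            sum += std::abs(cur.at(x + k, y + l) - ref.at(x + i + k, y + j + l));
    return static_cast<double>(sum) / (M * N);
}

// Full Search: evaluate every displacement in the [-p, p] search region
// and keep the one with the minimum MAE.
MotionVector fullSearch(const Frame& cur, const Frame& ref,
                        int x, int y, int M, int N, int p) {
    assert(M > 0 && N > 0 && p >= 0);
    assert(x >= 0 && y >= 0 && x + M <= cur.width && y + N <= cur.height);
    MotionVector best{0, 0, std::numeric_limits<double>::max()};
    for (int j = -p; j <= p; ++j) {
        for (int i = -p; i <= p; ++i) {
            // Skip candidates that leave the reference frame.
            if (x + i < 0 || y + j < 0 ||
                x + i + M > ref.width || y + j + N > ref.height) {
                continue;
            }
            double cost = blockMae(cur, ref, x, y, i, j, M, N);
            if (cost < best.mae) {
                best = MotionVector{i, j, cost};
            }
        }
    }
    return best;
}
```

Because every position in the search region is visited, the returned vector always has the minimum MAE among the candidates, at the price of up to (2p + 1)² block comparisons per Macroblock.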
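COMPLEXITY OF THE ALGORITHM –
For a picture resolution of I × J and a picture rate of F pictures per second, the overall
complexity in operations per second is
$$\frac{I J F}{M N} \times (2p + 1)^2 \times MN \times 3$$
where M × N is the Macroblock size and p indicates the dimensions of the search
area defined for finding a match. Here I J / (M N) is the number of Macroblocks per picture,
(2p + 1)² is the number of search positions per Macroblock, and M N × 3 is the cost of
evaluating the MAE at one position. The Macroblock size cancels out, so the complexity
depends only on the picture resolution, the picture rate and p.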
2.5 THE LOGARITHMIC SEARCH ALGORITHM
The Logarithmic Search algorithm was proposed in order to bring the computation
requirements for Motion Estimation to more acceptable levels. The Logarithmic Search
algorithm is very similar to the binary search method. It focuses on the reduction in number
of pixel comparisons at the expense of some amount of error introduced in the reconstructed
clip.
The working of this algorithm is given below –
In the first step, the [-p, p] search rectangle is divided into two areas: one inside a
[-p/2, p/2] rectangle and one outside it.
Furthermore, instead of searching the whole [-p/2, p/2] area, we only compute the
MAE function at nine locations: at (0, 0) and at the eight major points on the
perimeter of the [-p/2, p/2] area.
That is, if the distance between the points is d1, we select the minimum MAE from
the MAE values computed at (0, 0), (0, d1), (0, -d1), (-d1, 0), (d1, 0), (d1, d1), (d1, -d1),
(-d1, d1) and (-d1, -d1). The distance d1 is given by:
$$d_1 = 2^{k-1}, \quad \text{where } k = \lceil \log_2 p \rceil$$
COMPLEXITY OF THE ALGORITHM –
Overall, this method examines 8k + 1 search locations and computes the MAE at each. Hence, for a
picture resolution of I × J and a picture rate of F pictures per second, the overall complexity of
the logarithmic search in operations per second is
$$I J F \times (8k + 1) \times 3$$
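A sketch of the Logarithmic Search for a single Macroblock is given below. The text above describes only the first step, so the later steps are an assumption on our part: the search is re-centred on the best point found so far and the distance is halved until it reaches 1, which yields the 8k + 1 evaluations quoted above. The Frame and MotionVector structures and all function names are hypothetical, and candidates outside the frame or outside [-p, p] are skipped.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical 8-bit greyscale frame stored row by row (an assumption).
struct Frame {
    int width;
    int height;
    std::vector<std::uint8_t> pixels;
    int at(int x, int y) const {
        return pixels[static_cast<std::size_t>(y) * static_cast<std::size_t>(width) +
                      static_cast<std::size_t>(x)];
    }
};

struct MotionVector {
    int u;
    int v;
    double mae;
};

double blockMae(const Frame& cur, const Frame& ref,
                int x, int y, int i, int j, int M, int N) {
    long sum = 0;
    for (int l = 0; l < N; ++l)
        for (int k = 0; k < M; ++k)
            sum += std::abs(cur.at(x + k, y + l) - ref.at(x + i + k, y + j + l));
    return static_cast<double>(sum) / (M * N);
}

// Smallest k such that 2^k >= p, i.e. the ceiling of log2(p).
int ceilLog2(int p) {
    assert(p >= 1);
    int k = 0;
    while ((1 << k) < p) ++k;
    return k;
}

MotionVector logSearch(const Frame& cur, const Frame& ref,
                       int x, int y, int M, int N, int p) {
    assert(M > 0 && N > 0 && p >= 2);
    assert(x >= 0 && y >= 0 && x + M <= cur.width && y + N <= cur.height);
    assert(cur.width == ref.width && cur.height == ref.height);
    int cu = 0, cv = 0;                                  // current search centre
    double best = blockMae(cur, ref, x, y, 0, 0, M, N);  // cost at (0, 0)
    for (int d = 1 << (ceilLog2(p) - 1); d >= 1; d /= 2) {
        int bu = cu, bv = cv;
        for (int dj = -d; dj <= d; dj += d) {
            for (int di = -d; di <= d; di += d) {
                if (di == 0 && dj == 0) continue;        // centre already evaluated
                int i = cu + di, j = cv + dj;
                if (std::abs(i) > p || std::abs(j) > p) continue;
                if (x + i < 0 || y + j < 0 ||
                    x + i + M > ref.width || y + j + N > ref.height) continue;
                double cost = blockMae(cur, ref, x, y, i, j, M, N);
                if (cost < best) { best = cost; bu = i; bv = j; }
            }
        }
        cu = bu;  // re-centre on the best point, then halve the distance
        cv = bv;
    }
    return MotionVector{cu, cv, best};
}
```

Unlike the Full Search, this procedure can settle on a local minimum of the MAE, which is the loss of accuracy referred to above.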
3. DESIGN DETAILS
3.1 STRUCTURE OF ENCODER / DECODER
THE ENCODER:
[Block diagram of the encoder. Blocks and signals: AVI SOURCE FILE; AVI SPLITTER; KEY FRAME and
SUCCESSIVE FRAMES; MOTION ESTIMATION; MOTION COMPENSATION; MOTION VECTORS (u, v); DIFFERENCE
SIGNAL; COMPRESSED FILE CREATION; .mcp FILE (TO DECODER)]
The encoder takes as input an AVI file which is then passed onto the AVI splitter block. The
AVI splitter block splits the file into its component frames which are in the Bitmap (.BMP)
format.
The Key Frames and intermediate frames are fed to the Motion Estimation and Motion
Compensation blocks, which calculate the Motion Vectors and the difference signal. The Motion
Vectors, the difference signal (when Motion Compensation is used) and the Key Frames are
written into the compressed (.MCP) file. The resulting .MCP file is the compressed form of the
source AVI file.
THE DECODER:
[Block diagram of the decoder. Blocks and signals: .mcp FILE; .mcp DATA EXTRACTOR; KEY FRAMES +
MOTION VECTORS + DIFFERENCE VALUES; MOTION VECTOR ADDITION BLOCK; REGENERATED FRAMES; AVI
CONSTRUCTOR; REGENERATED AVI FILE]
The compressed .MCP file is obtained from the Encoder block. The file is then sent to the
.MCP Data Extractor block. This block is responsible for extracting the information about the
Key frames as well as the Motion Vectors encoded in the .mcp file.
Once the relevant data has been identified we can go about the process of regenerating the
original frames. The Motion Vectors and the respective Difference Values are added to the
Key frames to regenerate the frames.
Finally, the frames are sent to the AVI Constructor which combines them to produce the
regenerated AVI file.
3.2 COMPRESSED FILE HEADER FORMAT (.mcp FILE)
The Header part of the file consists of:
The first 4 bytes of the file are the letters “MEAC”. These letters are used for file
identification.
The next byte indicates the size of the Macroblock used for Motion Estimation.
The next byte indicates whether Motion Compensation is used or not.
After this the 56-byte AVI Header is stored for recreation of AVI at the decoder end.
Next the 56-byte AVI Stream Header is stored for recreation of AVI at the decoder
end.
After this the 40-byte Device Independent Bitmap (DIB) Header is stored.
The format of each frame stored is as follows –
Each frame is stored either completely, as a Key Frame, or in terms of Motion
Vectors and Motion Compensation data with respect to the most recent Key Frame. Thus each
frame has the following format:
Key Frame header has 8 bytes which identify the presence of a key frame and also
indicate whether the Key Frame is Run Length Encoded or not. The first three bytes
are “KEY”. The next byte is set if the Key Frame is Run Length Encoded else it is
reset. The remaining 4 bytes of the header give the size of the Key Frame data.
Key Frame data is stored immediately after the Key Frame header.
Motion Vectors are stored as 2 bytes for each Macroblock. The number of Motion
Vectors can be obtained from Macroblock size and from the width & height of each
frame
Motion Compensation data is Run Length Encoded. The first 4 bytes indicate the size
of the data stored. The data is stored immediately after these 4 bytes.
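The layout described above can be read back with a few fixed offsets. The sketch below is an illustration of that layout and not the utility's actual reader: the structure and function names are hypothetical, and the byte order of the 4-byte Key Frame size field is not stated in the format description, so little-endian is assumed here.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// File header: "MEAC", Macroblock size, compensation flag, then the three
// stored headers. Total size 4 + 1 + 1 + 56 + 56 + 40 = 158 bytes.
struct McpHeader {
    std::uint8_t macroblockSize;
    bool usesCompensation;
    std::array<std::uint8_t, 56> aviHeader;
    std::array<std::uint8_t, 56> aviStreamHeader;
    std::array<std::uint8_t, 40> dibHeader;
};

constexpr std::size_t kMcpHeaderSize = 4 + 1 + 1 + 56 + 56 + 40;

bool parseMcpHeader(const std::vector<std::uint8_t>& data, McpHeader& out) {
    if (data.size() < kMcpHeaderSize) return false;
    if (std::memcmp(data.data(), "MEAC", 4) != 0) return false;  // file identification
    out.macroblockSize = data[4];
    out.usesCompensation = data[5] != 0;
    std::memcpy(out.aviHeader.data(), data.data() + 6, 56);
    std::memcpy(out.aviStreamHeader.data(), data.data() + 62, 56);
    std::memcpy(out.dibHeader.data(), data.data() + 118, 40);
    return true;
}

// Key Frame header: "KEY", RLE flag, 4-byte data size.
struct KeyFrameHeader {
    bool runLengthEncoded;
    std::uint32_t dataSize;
};

bool parseKeyFrameHeader(const std::uint8_t* p, std::size_t n, KeyFrameHeader& out) {
    assert(p != nullptr);
    if (n < 8 || std::memcmp(p, "KEY", 3) != 0) return false;
    out.runLengthEncoded = p[3] != 0;
    // Assumption: the size field is stored least significant byte first.
    out.dataSize = static_cast<std::uint32_t>(p[4]) |
                   (static_cast<std::uint32_t>(p[5]) << 8) |
                   (static_cast<std::uint32_t>(p[6]) << 16) |
                   (static_cast<std::uint32_t>(p[7]) << 24);
    return true;
}
```

A reader would call parseMcpHeader once at the start of the file and then, for each frame, test for the "KEY" marker to decide whether a Key Frame or a set of Motion Vectors follows.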
3.3 PROCESS FLOWCHARTS
FLOWCHART FOR ENCODER:
[Flowchart steps: START; INPUT .avi SOURCE FILE; READ .avi HEADER; IS HEADER VALID? (No: DISPLAY
"INVALID FILE FORMAT"); SPLIT THE FILE? (Yes: AVI SPLIT); SELECT COMPRESSION METHOD (FULL SEARCH
or LOG SEARCH); PERFORM MOTION COMPENSATION; WRITE .mcp FILE; DISPLAY COMPRESSION STATISTICS;
STOP]
PROCEDURE AVISPLIT:
[Flowchart steps: FRAME = 1; READ NEXT FRAME FROM .avi FILE; WRITE FRAME TO .bmp FILE;
FRAME = FRAME + 1; IS FRAME <= TOTAL FRAMES? (Yes: read the next frame; No: STOP)]
PROCEDURE FULLSEARCH:
[Flowchart steps:
Y = 0
(C) X = 0
(B) IS X - P < 0? Yes: XMIN = 0; No: XMIN = X - P
IS X + P >= HORMAX? Yes: XMAX = HORMAX; No: XMAX = X + P
IS Y - P < 0? Yes: YMIN = 0; No: YMIN = Y - P
IS Y + P >= VERMAX? Yes: YMAX = VERMAX; No: YMAX = Y + P
minMAE = 999999; J = YMIN; I = XMIN
CALCULATE MAE AT POINT (I, J); IS MAE < minMAE? Yes: minMAE = MAE, MOTION VECTOR = (I, J)
I = I + 1; IS I < XMAX? Yes: calculate the next MAE; No: J = J + 1
IS J < YMAX? Yes: I = XMIN and calculate the next MAE; No: go to A
(A) X = X + 1; IS X < HORMAX? Yes: go to B; No: Y = Y + 1
IS Y < VERMAX? Yes: go to C; No: STOP]
PROCEDURE LOGSEARCH:
[Flowchart steps:
Y = 0
(C) X = 0
(B) dist = p / 2; CALCULATE MAE AT POINT (X, Y); I = 0
CALCULATE MAE AT POSITIONS: (X-dist, Y-dist), (X, Y-dist), (X+dist, Y-dist), (X-dist, Y),
(X+dist, Y), (X-dist, Y+dist), (X, Y+dist), (X+dist, Y+dist)
FIND MINIMUM MAE AND STORE IN MINMAE
STORE POSITION OF MINMAE AND SET THIS POSITION AS (X, Y)
I = I + 1; IS I < STEPS? Yes: calculate the MAE at the next set of positions; No: MOTION VECTOR = (X, Y), go to A
(A) X = X + 1; IS X < HORMAX? Yes: go to B; No: Y = Y + 1
IS Y < VERMAX? Yes: go to C; No: STOP]
FLOWCHART FOR DECODER:
[Flowchart steps: START; INPUT .mcp SOURCE FILE; READ .mcp HEADER; IS HEADER VALID? (No: DISPLAY
"INVALID FILE FORMAT"); REGENERATE FRAMES; WRITE .avi FILE; STOP]
PROCEDURE REGENERATE FRAMES:
[Flowchart steps: READ KEY FRAMES, MOTION VECTORS; ADD MOTION VECTOR TO KEY FRAME; ADD
COMPENSATION ERROR VALUE; STOP]
3.4 CLASS DIAGRAMS
3.5 INTERFACE DESIGN
SNAPSHOT OF ENCODER (COMPRESSION) WINDOW:
SNAPSHOT OF DECODER (DECOMPRESSION) WINDOW:
SNAPSHOT OF COMPRESSION STATISTICS WINDOW:
SNAPSHOT OF FRAME BROWSER WINDOW:
4. TESTING RESULTS AND CONCLUSIONS
The testing process in the case of this project aims to bring out the relative strengths and
weaknesses of the compression utility. In this case, the performance of the utility has been
tested in three different scenarios: a best case, a worst case and a general case.
4.1 TEST PLATFORM
The test platform was chosen with a view of using a system with an average configuration.
All tests were carried out on the following platform –
PROCESSOR Intel Pentium III 500 MHz
SYSTEM MEMORY 128 MB
OPERATING SYSTEM Redhat Linux 9.0
4.2 TEST CRITERIA AND VARIABLES
Conclusions have been drawn based upon the performance of the two algorithms on the basis
of the following parameters:
Complexity (Mega Operations per Second)
Compression Ratio – The ratio of reduction in size of file to the size of original file.
Computation Time – The amount of processor time utilized for a particular operation
Perceptual quality of the regenerated file.
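The Compression Ratio as defined above is the size reduction expressed as a fraction of the original size. A minimal sketch, with a hypothetical function name and the result expressed as a percentage, is:

```cpp
#include <cassert>

// Compression ratio in percent: reduction in size relative to the original.
// Hypothetical helper; sizes may be given in any unit as long as both match.
double compressionRatioPercent(double originalSize, double compressedSize) {
    assert(originalSize > 0.0);
    return 100.0 * (originalSize - compressedSize) / originalSize;
}
```

For example, the Test Case 2 file sizes reported in Section 4.4 (10,694 KB reduced to 3,674 KB) give a ratio of about 65.64%.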
Each of the three test cases is provided as input to the utility. The effect of varying the design
parameters (Macro Block size, Key Frame Interval and Search Region size (p)) on the
performance of the two algorithms is also recorded.
4.3 SELECTION OF TEST CASES
TEST CASE 1:
Filename                clockuc.avi
Size                    1,236 KB
RLE Compressed Frames   No
Bits per pixel          8
Dimensions of Frame     321 x 321
Frames per second       1
Type of Video           Object undergoing motion, background is constant
This test file represents the ideal case of input that could be presented to the utility. In this
case the object in the foreground is undergoing motion while the background remains
absolutely constant. The AVI file is in the raw format (non-RLE encoded) and operates at a
very low rate of 1 frame per second.
TEST CASE 2:
Filename                gyrobotuc.avi
Size                    10,694 KB
RLE Compressed Frames   No
Bits per pixel          24
Dimensions of Frame     160 x 120
Frames per second       25
Type of Video           Object is constant, background is undergoing motion
This test file represents a more practical kind of AVI file that the compression utility is likely to
encounter. The object in the foreground is relatively constant while the background is in
motion. The file has a frame rate of 25 frames per second with a 24-bit Bitmap representation.
TEST CASE 3:
Filename                mirranew.avi
Size                    63,730 KB
RLE Compressed Frames   No
Bits per pixel          24
Dimensions of Frame     320 x 240
Frames per second       15
Type of Video           Both object and background are undergoing motion
This file represents the worst case scenario for the utility. The file depicts a series of high
speed stunts being carried out on a bicycle. As a result both the background and the
foreground are in motion simultaneously. Also the AVI runs at a rate of 15 frames per second
with a 24-bit colour representation.
In this case, the test file violates the basic assumption of the Motion Estimation algorithms:
that objects in the scene undergo only translatory motion, so that a rectangular
geometry is sufficient.
4.4 TEST RESULTS
MOTION ESTIMATION ONLY (NO COMPENSATION)
A. COMPLEXITY:
TEST CASE 1: clockuc.avi
Complexity (Mega Operations per second)
Macro Block   Key Frame   Search Region   Full Search   Logarithmic Search
size          Interval    size (p)
8             3           7               69.5          7.7
4             3           7               69.5          7.7
8             5           7               69.5          7.7
8             3           15              297.1         10.2
TEST CASE 2: gyrobotuc.avi
Complexity (Mega Operations per second)
Macro Block   Key Frame   Search Region   Full Search   Logarithmic Search
size          Interval    size (p)
8             3           7               324           36
4             3           7               324           36
8             5           7               324           36
8             3           15              1383.8        47.5
TEST CASE 3: mirranew.avi
Complexity (Mega Operations per second)
Macro Block   Key Frame   Search Region   Full Search   Logarithmic Search
size          Interval    size (p)
8             3           7               777.6         86.4
4             3           7               777.6         86.4
8             5           7               777.6         86.4
8             3           15              3321.2        114.0
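The complexity figures in these tables can be cross-checked against the complexity expressions of Sections 2.4 and 2.5. The sketch below assumes that those expressions simplify to 3 × I × J × F × (2p + 1)² for the Full Search and 3 × I × J × F × (8k + 1), with k the ceiling of log2(p), for the Logarithmic Search; the function names are hypothetical.

```cpp
#include <cassert>

// Smallest k such that 2^k >= p, i.e. the ceiling of log2(p).
int ceilLog2(int p) {
    assert(p >= 1);
    int k = 0;
    while ((1 << k) < p) ++k;
    return k;
}

// Full Search complexity in Mega Operations per second for an I x J picture
// at F pictures per second and search region [-p, p].
double fullSearchMops(int I, int J, double F, int p) {
    assert(I > 0 && J > 0 && p >= 0);
    double positions = static_cast<double>(2 * p + 1) * (2 * p + 1);
    return 3.0 * I * J * F * positions / 1e6;
}

// Logarithmic Search complexity in Mega Operations per second.
double logSearchMops(int I, int J, double F, int p) {
    assert(I > 0 && J > 0 && p >= 1);
    double positions = 8.0 * ceilLog2(p) + 1.0;
    return 3.0 * I * J * F * positions / 1e6;
}
```

For Test Case 2 (160 × 120 at 25 frames per second) this gives 324 and 36 for p = 7, and about 1383.8 and 47.5 for p = 15, in agreement with the table. Since the Macroblock size and the Key Frame Interval do not appear in either expression, the first three rows of each table are identical.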
B. COMPRESSION RATIO:
TEST CASE 1: clockuc.avi
Input File Size: 1,236KB
Macro Block   Key Frame   Search Region   Full Search        Full Search         Logarithmic Search   Logarithmic Search
size          Interval    size (p)        Output File Size   Compression Ratio   Output File Size     Compression Ratio
8             3           7               42KB               96.59%              42KB                 96.59%
4             3           7               117KB              90.52%              117KB                90.52%
8             5           7               42KB               96.64%              42KB                 96.64%
8             3           15              42KB               96.59%              42KB                 96.59%
TEST CASE 2: gyrobotuc.avi
Input File Size: 10,694KB
Macro Block   Key Frame   Search Region   Full Search        Full Search         Logarithmic Search   Logarithmic Search
size          Interval    size (p)        Output File Size   Compression Ratio   Output File Size     Compression Ratio
8             3           7               3,674KB            65.64%              3,674KB              65.64%
4             3           7               3,896KB            63.57%              3,896KB              63.57%
8             5           7               2,226KB            79.18%              2,226KB              79.18%
8             3           15              3,674KB            65.64%              3,674KB              65.64%
TEST CASE 3: mirranew.avi
Input File Size: 63,730KB
Macro Block   Key Frame   Search Region   Full Search        Full Search         Logarithmic Search   Logarithmic Search
size          Interval    size (p)        Output File Size   Compression Ratio   Output File Size     Compression Ratio
8             3           7               14,630KB           77.05%              14,630KB             77.05%
4             3           7               15,920KB           75.02%              15,920KB             75.02%
8             5           7               9,062KB            85.78%              9,062KB              85.78%
8             3           15              14,630KB           77.05%              14,630KB             77.05%
C. COMPUTATION TIME
TEST CASE 1: clockuc.avi
Computation Time
Macro Block size   Key Frame Interval   Search Region size (p)   Full Search   Logarithmic Search
8                  3                    7                        1.25s         0.26s
4                  3                    7                        0.89s         0.31s
8                  5                    7                        1.41s         0.28s
8                  3                    15                       4.33s         0.29s
TEST CASE 2: gyrobotuc.avi
Computation Time
Macro Block size   Key Frame Interval   Search Region size (p)   Full Search   Logarithmic Search
8                  3                    7                        15.03s        3.06s
4                  3                    7                        21.64s        3.76s
8                  5                    7                        17.98s        3.48s
8                  3                    15                       29.03s        3.47s
TEST CASE 3: mirranew.avi
Computation Time
Macro Block size   Key Frame Interval   Search Region size (p)   Full Search   Logarithmic Search
8                  3                    7                        104.43s       18.89s
4                  3                    7                        121.96s       21.37s
8                  5                    7                        123.89s       21.20s
8                  3                    15                       316.21s       22.37s
MOTION ESTIMATION AND MOTION COMPENSATION
A. COMPLEXITY:
TEST CASE 1: clockuc.avi
Complexity in Mega Operations per second
Macro Block size   Key Frame Interval   Search Region size (p)   Full Search   Logarithmic Search
8                  3                    7                        69.5          7.7
4                  3                    7                        69.5          7.7
8                  5                    7                        69.5          7.7
8                  3                    15                       297.1         10.2
TEST CASE 2: gyrobotuc.avi
Complexity in Mega Operations per second
Macro Block size   Key Frame Interval   Search Region size (p)   Full Search   Logarithmic Search
8                  3                    7                        324           36
4                  3                    7                        324           36
8                  5                    7                        324           36
8                  3                    15                       1383.8        47.5
TEST CASE 3: mirranew.avi
Complexity in Mega Operations per second
Macro Block size   Key Frame Interval   Search Region size (p)   Full Search   Logarithmic Search
8                  3                    7                        777.6         86.4
4                  3                    7                        777.6         86.4
8                  5                    7                        777.6         86.4
8                  3                    15                       3321.2        114.0
B. COMPRESSION RATIO:
TEST CASE 1: clockuc.avi
Input File Size: 1,236KB
Macro Block   Key Frame   Search Region   Full Search        Full Search         Logarithmic Search   Logarithmic Search
size          Interval    size (p)        Output File Size   Compression Ratio   Output File Size     Compression Ratio
8             3           7               76KB               93.85%              78KB                 93.67%
4             3           7               145KB              88.27%              149KB                87.92%
8             5           7               82KB               93.39%              83KB                 93.26%
8             3           15              75KB               93.95%              79KB                 93.60%
TEST CASE 2: gyrobotuc.avi
Input File Size: 10,694KB
Macro Block   Key Frame   Search Region   Full Search        Full Search         Logarithmic Search   Logarithmic Search
size          Interval    size (p)        Output File Size   Compression Ratio   Output File Size     Compression Ratio
8             3           7               6,401KB            40.14%              6,289KB              41.19%
4             3           7               6,986KB            34.67%              3,892KB              35.83%
8             5           7               5,621KB            47.44%              5,488KB              48.68%
8             3           15              5,636KB            47.30%              5,500KB              48.57%
TEST CASE 3: mirranew.avi
Input File Size: 63,730KB
Macro Block   Key Frame   Search Region   Full Search        Full Search         Logarithmic Search   Logarithmic Search
size          Interval    size (p)        Output File Size   Compression Ratio   Output File Size     Compression Ratio
8             3           7               35,260KB           44.67%              35,150KB             44.85%
4             3           7               35,960KB           43.57%              36,340KB             42.98%
8             5           7               33,990KB           46.67%              33,940KB             46.74%
8             3           15              35,220KB           44.73%              35,220KB             44.73%
C. COMPUTATION TIME
TEST CASE 1: clockuc.avi
Computation Time
Macro Block size   Key Frame Interval   Search Region size (p)   Full Search   Logarithmic Search
8                  3                    7                        1.81s         0.85s
4                  3                    7                        1.50s         0.90s
8                  5                    7                        2.05s         0.94s
8                  3                    15                       4.97s         0.87s
TEST CASE 2: gyrobotuc.avi
Computation Time
Macro Block size   Key Frame Interval   Search Region size (p)   Full Search   Logarithmic Search
8                  3                    7                        16.92s        4.78s
4                  3                    7                        23.69s        5.62s
8                  5                    7                        20.10s        5.60s
8                  3                    15                       37.48s        6.08s
TEST CASE 3: mirranew.avi
Computation Time
Macro Block size   Key Frame Interval   Search Region size (p)   Full Search   Logarithmic Search
8                  3                    7                        116.37s       29.99s
4                  3                    7                        135.27s       33.06s
8                  5                    7                        139.31s       34.57s
8                  3                    15                       332.58s       33.63s
4.5 SAMPLE FRAMES
Following are some sample frames at different stages of the encoding process using the Full
Search method. The sample frames are part of the mirranew.avi test file with the variables at
their default values.
ORIGINAL FRAMES (AT THE ENCODER):
[Figures: Frame no. 243 (KEYFRAME), Frame no. 244, Frame no. 245]
FRAMES AFTER ADDITION OF MOTION VECTORS (AT THE DECODER):
[Figures: Frame no. 243 (KEYFRAME), Frame no. 244, Frame no. 245]
Frames 244 and 245 shown here have been generated from the KEYFRAME (frame no. 243)
by adding the corresponding Motion Vectors from the .mcp file given as input to the decoder.
Frame 243 + Motion Vectors = Frame 244, Frame 245
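The decoder step described above can be sketched in C as follows. This is only a minimal sketch under stated assumptions: frames are 8-bit and single channel, there is one motion vector per macroblock stored in raster order, and each vector points from the block's position to its best match in the key frame. The names MotionVector and predict_frame are hypothetical, and the actual .mcp layout is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical motion vector: offset of the matching block in the key frame. */
typedef struct {
    int dx;
    int dy;
} MotionVector;

/* Clamp a coordinate into [0, limit - 1] so reads stay inside the frame. */
static int clamp_coord(int v, int limit)
{
    if (v < 0) return 0;
    if (v >= limit) return limit - 1;
    return v;
}

/*
 * Build a predicted frame from a key frame and one motion vector per block.
 * Frames are row-major; width and height are assumed to be multiples of block.
 */
void predict_frame(const uint8_t *key, uint8_t *out, int width, int height,
                   int block, const MotionVector *mv)
{
    assert(block > 0 && width % block == 0 && height % block == 0);
    int blocks_per_row = width / block;
    for (int by = 0; by < height / block; by++) {
        for (int bx = 0; bx < blocks_per_row; bx++) {
            MotionVector v = mv[by * blocks_per_row + bx];
            for (int y = 0; y < block; y++) {
                for (int x = 0; x < block; x++) {
                    int dst_x = bx * block + x;
                    int dst_y = by * block + y;
                    int src_x = clamp_coord(dst_x + v.dx, width);
                    int src_y = clamp_coord(dst_y + v.dy, height);
                    out[dst_y * width + dst_x] = key[src_y * width + src_x];
                }
            }
        }
    }
}
```

Clamping at the frame border is one possible design choice; the utility may handle out-of-frame vectors differently.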
FRAMES AFTER MOTION COMPENSATION PROCESS:
[Figures: Frame no. 243 (KEYFRAME), Frame no. 244, Frame no. 245]
Frames 244 and 245 shown here have been regenerated from the KEYFRAME (to which
Motion Vectors have already been added). The regeneration operation has been performed by
adding the difference values from the .mcp file.
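The compensation step can be illustrated with a short C sketch. It assumes that each stored difference value is the signed quantity (original pixel minus predicted pixel) and that pixels are 8-bit; the function name compensate_frame and the int16_t storage are assumptions made for illustration, not the format used in the .mcp file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Saturate an integer to the 8-bit pixel range. */
static uint8_t clamp_pixel(int v)
{
    if (v < 0) return 0;
    if (v > 255) return 255;
    return (uint8_t)v;
}

/*
 * Reconstruct a frame as predicted pixel + stored difference.
 * diff[i] is assumed to hold (original - predicted) as a signed value.
 */
void compensate_frame(const uint8_t *predicted, const int16_t *diff,
                      uint8_t *out, size_t count)
{
    assert(predicted != NULL && diff != NULL && out != NULL);
    for (size_t i = 0; i < count; i++) {
        out[i] = clamp_pixel((int)predicted[i] + (int)diff[i]);
    }
}
```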
4.6 CONCLUSIONS
Based on the test results recorded, we can derive the following conclusions about the
performance of the two algorithms:
The computational complexity of the Full Search algorithm is about one order of magnitude
greater than that of the Logarithmic Search algorithm, and the gap widens as the search
region grows. When the search region size increases from 7 to 15, the computational
complexity of the Full Search method relative to the Logarithmic Search method rises from
a factor of 9 to about 29.
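The complexity figures in Section 4.4 can be reproduced with the short C sketch below. The expressions are assumptions, not quoted from this section: (2p + 1)^2 candidate positions per block for Full Search, 8 * ceil(log2(p + 1)) + 1 positions for Logarithmic Search, and 3 operations per pixel comparison. They are used here only because they reproduce the tabulated values for mirranew.avi (320 x 240, 15 frames per second). All function names are hypothetical.

```c
#include <assert.h>

/* Assumed cost of one pixel comparison: subtract, absolute value, accumulate. */
#define OPS_PER_PIXEL 3

/* Candidate positions examined by Full Search: (2p + 1)^2. */
long full_search_positions(int p)
{
    assert(p > 0);
    return (long)(2 * p + 1) * (2 * p + 1);
}

/* Candidate positions for Logarithmic Search: 8 * ceil(log2(p + 1)) + 1. */
long log_search_positions(int p)
{
    assert(p > 0);
    int steps = 0;
    for (int reach = 1; reach < p + 1; reach *= 2) {
        steps++;
    }
    return 8L * steps + 1;
}

/* Complexity in mega operations per second for a width x height clip. */
double mega_ops_per_second(long positions, int width, int height, int fps)
{
    return (double)positions * width * height * OPS_PER_PIXEL * fps / 1e6;
}
```

Under these assumptions the result does not depend on the Macroblock size or the key frame interval, which agrees with the identical rows in the complexity tables.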
Both algorithms are evenly matched with respect to compression ratio. From the test data
with Motion Compensation, we can see that higher compression was achieved by the Full
Search method in the first test case, whereas the ratio was better for the Logarithmic Search
method in test case 2 and in most runs of test case 3. This difference in compression ratios
depends largely on the nature of the source AVI file, and in the given test cases it stays
within about +/- 1%. The compression ratio also increases with the Macroblock size: in
every test case a Macroblock size of 8 compressed better than a size of 4, but at the cost of
perceptual quality.
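The compression ratios in the tables appear to be the percentage reduction in file size. The C sketch below assumes that definition, 100 * (1 - output size / input size), which matches the tabulated values; the function name compression_percent is hypothetical.

```c
#include <assert.h>

/* Percentage reduction in file size: 100 * (1 - output / input). */
double compression_percent(double input_kb, double output_kb)
{
    assert(input_kb > 0.0);
    return 100.0 * (1.0 - output_kb / input_kb);
}
```

For example, an input of 10,694KB and an output of 3,674KB give a value close to the 65.64% reported for gyrobotuc.avi.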
The computation time is a function of the Macroblock size and the size of the search area.
The keyframe interval determines how many frames have to be encoded as motion vectors;
hence, as its value increases, more Motion Vectors have to be calculated. As the Macroblock
size increases, more pixel comparisons need to be made in each evaluation of the MAE value.
Similarly, the number of computations increases with the search area, because more
candidate positions have to be evaluated.
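As an illustration of why the work per evaluation grows with the Macroblock size, the following C sketch computes the MAE (Mean Absolute Error) between one block of the current frame and one candidate block of the reference frame. It assumes 8-bit single-channel frames stored row-major; the function name block_mae and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Mean absolute error between the block at (cx, cy) in the current frame and
 * the candidate block at (rx, ry) in the reference frame. Both frames share
 * the same width; the caller keeps both blocks inside the frame.
 */
double block_mae(const uint8_t *cur, const uint8_t *ref, int width,
                 int cx, int cy, int rx, int ry, int block)
{
    assert(block > 0);
    long sum = 0;
    for (int y = 0; y < block; y++) {
        for (int x = 0; x < block; x++) {
            int a = cur[(cy + y) * width + (cx + x)];
            int b = ref[(ry + y) * width + (rx + x)];
            sum += abs(a - b);
        }
    }
    return (double)sum / (block * block);
}
```

The inner loops run block * block times, so one evaluation for an 8 x 8 block touches four times as many pixels as one for a 4 x 4 block.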
The major pitfall of the Full Search method is highlighted by the computation times. In all
the test cases, the computation times for the Full Search method are considerably higher
than those for the Logarithmic Search method, by a factor of up to about 15 at a search
region size of 15. This makes the Full Search algorithm impractical in time-critical
applications such as the broadcasting of live video.
Since the Full Search method computes the MAE value for every candidate Macroblock
position in the search region, the perceptual quality of the Motion Estimated clip is superior
to that of the Logarithmic Search method. The errors introduced in the Motion Estimation
stage are almost completely removed in the Motion Compensation process, producing the
same perceptual quality as the source file in both methods. Hence, if Motion Compensation
is employed, the relative degradation in perceptual quality in the Logarithmic Search method
becomes insignificant.
Thus, although less accurate with respect to the Motion Estimation process, the Logarithmic
Search stands out as the superior method when evaluated on practical considerations such as
computational complexity and computation time.
5. APPLICATIONS AND FUTURE SCOPE
5.1 APPLICATIONS
Video finds use in a number of applications. With video compression, an entire gamut of
services comes alive due to the reduced transmission rate. We enumerate a few of the many
applications possible:
REDUCED STORAGE NEEDS:
Our project reduces the file size of an AVI input by converting it into the compressed video
(MCP) format. This file can then be stored on a device to save space. With reduced video
file sizes, more video data can be stored on media such as CDs, digital cameras and a host
of other devices. When the file needs to be viewed, it can be supplied as input to the utility,
which will then convert it back into an AVI file suitable for playing on a media player.
VIDEO ON DEMAND:
Video on demand can be compared to an online electronic rental store. The user selects
any one of a large number of videos available in the electronic video library. A central
server acts as the video repository. The files here are stored in compressed video format
(MCP). The server sets up a connection with the user computer using TCP/IP. The
contents of the video are sent over the network and simultaneously played on the user’s
computer. Since the video is compressed, the bandwidth requirements will be much less.
At the user’s end the data received from the server is given as input to our utility. The
decompression can be done on the fly and displayed to the user.
This technique will benefit not only the consumer but also the electronic rental store. The
consumer can be given an account which is charged as and when he decides to view a
video. The video will not be stored on the user’s computer, thus reducing his storage
needs. Also, since the video is not on a device accessible to the user, the extent of video
piracy will decrease. This will monetarily benefit the electronic rental store.
ONLINE VIDEO CONFERENCING:
One of the prime applications of our video compression utility will be online video
conferencing. This would make a simultaneous real-time video conversation among a
number of individuals possible.
Web cams can be used to generate the video which will serve as the streaming input to the
utility. The compressed video will reduce the bandwidth required and will make optimum
use of the provided resources. We do not generally associate conferences with lots of
movement; hence it is safe to presume that there is very little difference in motion between
consecutive frames transmitted. This will keep the quality of the pictures close to that of
the original images captured. Moreover, picture quality is expendable in such cases,
whereas jerks in the video would certainly not be desirable.
Bandwidth reduction will be the key criterion for such applications and hence video
compression will be utilized heavily.
TELEVISION BROADCASTING:
Our utility is well suited to compressing AVIs which have very little interframe motion.
This makes it particularly useful in the broadcast of reduced-motion images such as news
bulletins.
News bulletins typically portray the news reader at the right of the picture with a small
screen behind flashing the headlines or news clips. In this scenario there is very little
interframe motion, viz. the lip movement of the news reader and the motion of the news
clips in the background. This is a best-case situation for the video compression utility,
which offers the best utilization of the available bandwidth. Live reporting of news events
can be made possible at a fraction of the cost.
Moreover, due to the reduced bandwidth required for the compressed video, the remaining
bandwidth can be used for incorporating other information. With technologies like HDTV,
digital video broadcasting will only increase in volume. Compression of such video
transmissions will be of utmost priority.
5.2 FUTURE SCOPE
The compression utility currently works only on audio-subdued AVI files. In addition,
support has been provided only for the Raw AVI and RLE-encoded AVI file formats.
Further efforts can be made in the following directions to enhance the features of the utility
and to achieve better results.
ADDITIONAL CODEC SUPPORT :
There are over 350 Codec formats available for the AVI file type. In addition to
supporting the current raw AVI and RLE AVI, the project can be extended to support
some of the other popular AVI Codec formats.
Support for other formats can be added by including additional functions for
extracting their header information. Also, appropriate code can be written for
processing the Codec-specific data stored in the file.
SUPPORT FOR SOUND:
The project can be extended to support compression and transmission of sound data
along with video for a more realistic approach to motion picture compression.
The LIST “movi” chunk of the AVI file holds the actual stream data, that is, the
picture and the sound, which are differentiated by the four-character codes indicating
the start of each type of chunk. The audio data contained in these data chunks does
not contain any information about its format (refer to Appendix-A for the header format).
The audio data can be removed from the input file and compressed using a suitable
audio compression technique. This compressed data can then be appended to the
compressed file (the .mcp file). While decoding, the compressed sound data should be
properly decompressed and appended to the recreated AVI file. Care must be taken
that the audio data is placed in its proper location as specified in the AVI file format.
USE OF NON-RLE CODING METHODS:
The project currently employs RLE for compressing the difference signal obtained
after Motion Compensation. However, the use of other coding techniques such as
Huffman Coding for the same operation will in most cases produce higher
compression levels.
The statistical nature of the difference signal data will allow the utility to achieve
higher compression ratios. Owing to the accuracy of the Motion Estimation process, the
difference signal will contain relatively little data (i.e. most of the difference values
are zero). As a result, the use of the Huffman coding method will substantially
increase the compression achieved.
However, the increased compression levels can be achieved only at the expense of
higher complexity and processing time than the RLE method. Thus, the adoption of
advanced coding techniques would be feasible only if the compression ratio is the prime
requirement of the video coder.
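To show why a difference signal that is mostly zero suits run-length coding, here is a minimal C sketch of a byte-oriented RLE. The (value, run length) pair layout with runs capped at 255 is an assumption chosen for illustration; it is not the RLE layout used by the utility or by RLE-encoded AVI files, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Run-length encode src into (value, run) byte pairs; runs are capped at 255.
 * Returns the number of bytes written, or 0 if dst is too small.
 */
size_t rle_encode(const uint8_t *src, size_t n, uint8_t *dst, size_t cap)
{
    assert(src != NULL && dst != NULL);
    size_t out = 0;
    size_t i = 0;
    while (i < n) {
        uint8_t value = src[i];
        size_t run = 1;
        while (i + run < n && src[i + run] == value && run < 255) {
            run++;
        }
        if (out + 2 > cap) return 0;
        dst[out++] = value;
        dst[out++] = (uint8_t)run;
        i += run;
    }
    return out;
}

/*
 * Expand (value, run) pairs back into bytes.
 * Returns the number of bytes produced, or 0 if dst is too small.
 */
size_t rle_decode(const uint8_t *src, size_t n, uint8_t *dst, size_t cap)
{
    assert(src != NULL && dst != NULL && n % 2 == 0);
    size_t out = 0;
    for (size_t i = 0; i < n; i += 2) {
        size_t run = src[i + 1];
        if (out + run > cap) return 0;
        for (size_t k = 0; k < run; k++) {
            dst[out++] = src[i];
        }
    }
    return out;
}
```

A long run of zero differences collapses to a single pair, whereas isolated non-zero values cost two bytes each; an entropy coder such as Huffman coding avoids that fixed overhead at the price of extra processing.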
DESIGN OF A HYBRID CODER :
This project involved the design and implementation of an Interframe Coder using the
principles of Motion Estimation and Compensation only. The project can be extended
to include an Intraframe Coding method as a part of the Encoder.
For the process of intraframe coding, either the DCT or the Wavelet transform can be
adopted. The DCT / Wavelet transform module can be integrated into the utility after
the Motion Estimation / Compensation modules. Similarly, on the decoder side, the
module for regenerating the DCT / Wavelet compressed frames must be included
before the addition of Motion Vectors to the Key frames is done.
The design of such a hybrid coder (interframe/intraframe) is the basis for current
video coding standards. It will lead to higher levels of compression than either
interframe or intraframe encoding performed individually.
HARDWARE BASED IMPLEMENTATION :
To improve the computation times drastically, a completely hardware-based system
could be constructed for the Estimation and Compensation process. This would
substantially reduce the computation times for the compression process.
Modern computer graphics hardware contains extremely powerful graphics
processing units (GPUs). These GPUs are designed to perform a limited number of
operations on very large amounts of data. They typically have more than one
processing pipeline working in parallel with each other. They can in fact be thought
of as highly parallel Single Instruction Multiple Data (SIMD) type processors.
The current NVIDIA GeForce FX 5900 GPU peaks at 20 Gigaflops, which is roughly
equivalent to a 10-GHz Pentium 4 processor. The latest generation of graphics
hardware also contains much more programmable GPUs. The increasing
performance of GPUs can be harnessed to perform the Motion Estimation /
Compensation process purely in hardware-embedded code to achieve
extremely fast computations.
An appropriate algorithm flow for the Full Search algorithm using the GPU as a
co-processor for the CPU can be described as follows:
First, the two frames are loaded as textures onto the graphics hardware. These are
denoted Texture0 and Texture1 respectively. The current motion vector to be
checked is passed as a parameter to the GPU. The absolute difference between the two
frames for that motion vector is then generated using vertex and fragment programs. In
this way, if a frame needs to be interpolated, it will be interpolated on the GPU. This
results in an image which is the absolute value of the difference between the two frames.
This image is then read back to the CPU, where a Mean Absolute Error measure for each
block in the image is generated. This is repeated for each motion vector in the candidate
set. The motion vector which yields the smallest Mean Absolute Error for each block is
chosen as the motion vector for that block. (For further reference, see the papers on motion
estimation and compensation using GPUs by Yang and Welch.)
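The flow above can be emulated entirely on the CPU to make the loop structure explicit. The following C sketch is an illustration only: the function abs_diff_image stands in for the GPU pass (no graphics API is used), frames are assumed to be 8-bit and single channel, and pixels whose shifted position falls outside the frame are assigned the maximum difference of 255, which is an assumption of this sketch. All names are hypothetical.

```c
#include <assert.h>
#include <float.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    int dx;
    int dy;
} MotionVector;

/* Stand-in for the GPU pass: |cur - ref shifted by (dx, dy)| for every pixel. */
static void abs_diff_image(const uint8_t *cur, const uint8_t *ref,
                           uint8_t *diff, int w, int h, int dx, int dy)
{
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            int sx = x + dx;
            int sy = y + dy;
            if (sx < 0 || sx >= w || sy < 0 || sy >= h) {
                diff[y * w + x] = 255;  /* penalise off-frame candidates */
            } else {
                diff[y * w + x] = (uint8_t)abs(cur[y * w + x] - ref[sy * w + sx]);
            }
        }
    }
}

/*
 * CPU pass: for every candidate vector, reduce the difference image to a
 * per-block MAE and keep the vector with the smallest MAE for each block.
 * best_mv must hold (w / block) * (h / block) entries.
 */
void full_search_candidate_major(const uint8_t *cur, const uint8_t *ref,
                                 int w, int h, int block, int p,
                                 MotionVector *best_mv)
{
    assert(block > 0 && w % block == 0 && h % block == 0 && p >= 0);
    int bw = w / block;
    int nblocks = bw * (h / block);
    uint8_t *diff = malloc((size_t)w * (size_t)h);
    double *best = malloc(sizeof *best * (size_t)nblocks);
    assert(diff != NULL && best != NULL);
    for (int b = 0; b < nblocks; b++) best[b] = DBL_MAX;

    for (int dy = -p; dy <= p; dy++) {
        for (int dx = -p; dx <= p; dx++) {
            abs_diff_image(cur, ref, diff, w, h, dx, dy);
            for (int b = 0; b < nblocks; b++) {
                int x0 = (b % bw) * block;
                int y0 = (b / bw) * block;
                long sum = 0;
                for (int y = 0; y < block; y++)
                    for (int x = 0; x < block; x++)
                        sum += diff[(y0 + y) * w + (x0 + x)];
                double mae = (double)sum / (block * block);
                if (mae < best[b]) {
                    best[b] = mae;
                    best_mv[b].dx = dx;
                    best_mv[b].dy = dy;
                }
            }
        }
    }
    free(diff);
    free(best);
}
```

The outer loops run over candidate vectors rather than over blocks, which mirrors the description: one whole-frame difference image per candidate, followed by a per-block reduction.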
APPENDIX – A
AVI FILE HEADER FORMAT
AVI FILES:
The Microsoft Audio/Video Interleaved (AVI) file format is a RIFF file specification used
with applications that capture, edit, and playback audio/video sequences. In general, AVI
files contain multiple streams of different types of data. Most AVI sequences will use both
audio and video streams. A simple variation for an AVI sequence uses video data and does
not require an audio stream. This section describes AVI files containing only audio and video
data.
This section covers the following topics:
The required chunks of an AVI file
The optional chunks of an AVI file
AVI RIFF FORM
AVI files use the AVI RIFF form. The AVI RIFF form is identified by the four-character
code “AVI ” (note the trailing space). All AVI files include two mandatory LIST chunks. These chunks define the
format of the streams and stream data. AVI files might also include an index chunk. This
optional chunk specifies the location of data chunks within the file. An AVI file with these
components has the following form:
RIFF ('AVI '
      LIST ('hdrl'
            .
            .
           )
      LIST ('movi'
            .
            .
           )
      ['idx1'<AVI Index>]
     )
The LIST chunks and the index chunk are subchunks of the RIFF “AVI ” chunk. The “AVI ”
chunk identifies the file as an AVI RIFF file. The LIST “hdrl” chunk defines the format of
the data and is the first required list chunk. The LIST “movi” chunk contains the data for the
AVI sequence and is the second required list chunk. The “idx1” chunk is the optional index
chunk. AVI files must keep these three components in the proper sequence. The LIST “hdrl”
and LIST “movi” chunks use subchunks for their data. The following example shows the
AVI RIFF form expanded with the chunks needed to complete the LIST “hdrl” and LIST
“movi” chunks:
RIFF ('AVI '
      LIST ('hdrl'
            'avih'(<Main AVI Header>)
            LIST ('strl'
                  'strh'(<Stream header>)
                  'strf'(<Stream format>)
                  'strd'(additional header data)
                  .
                  .
                 )
            .
            .
           )
      LIST ('movi'
            {SubChunk | LIST('rec '
                             SubChunk1
                             SubChunk2
                             .
                             .
                            )
             .
             .
            }
            .
            .
           )
      ['idx1'<AVIIndex>]
     )
The following sections describe the chunks contained in the LIST “hdrl” and LIST “movi”
chunks as well as the “idx1” chunk.
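Every chunk in the forms above begins with the same 8-byte RIFF header: a four-character code followed by a 32-bit little-endian payload size, with chunks padded to an even length. The C sketch below reads such a header from a memory buffer and computes where the next sibling chunk starts. The type RiffChunkHeader and the function names are hypothetical helpers, not part of any Windows API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    char     id[5];   /* four-character code plus terminating NUL */
    uint32_t size;    /* payload size in bytes, excluding the 8-byte header */
} RiffChunkHeader;

/* Read a 32-bit little-endian value, as used for RIFF sizes. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Parse the chunk header at buf. Returns 0 on success, -1 if len < 8. */
int riff_read_header(const uint8_t *buf, size_t len, RiffChunkHeader *out)
{
    assert(buf != NULL && out != NULL);
    if (len < 8) return -1;
    memcpy(out->id, buf, 4);
    out->id[4] = '\0';
    out->size = read_le32(buf + 4);
    return 0;
}

/* Offset of the next sibling chunk: header + payload, padded to an even size. */
size_t riff_next_offset(size_t offset, const RiffChunkHeader *h)
{
    return offset + 8 + h->size + (h->size & 1u);
}
```

For a RIFF or LIST chunk, the first four payload bytes hold the form or list type (for example 'AVI ', 'hdrl' or 'movi'), and the subchunks follow.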
THE MAIN AVI HEADER LIST
The AVI file header is identified by the four-character code “avih”. The main header has the
following data structure defined for it:
typedef struct {
    DWORD dwMicroSecPerFrame;
    DWORD dwMaxBytesPerSec;
    DWORD dwReserved1;
    DWORD dwFlags;
    DWORD dwTotalFrames;
    DWORD dwInitialFrames;
    DWORD dwStreams;
    DWORD dwSuggestedBufferSize;
    DWORD dwWidth;
    DWORD dwHeight;
    DWORD dwScale;
    DWORD dwRate;
    DWORD dwStart;
    DWORD dwLength;
} MainAVIHeader;
The dwMicroSecPerFrame field specifies the period between video frames.
The dwMaxBytesPerSec field specifies the approximate maximum data rate of the file.
The dwFlags field contains any flags for the file. The following flags are defined:
AVIF_HASINDEX
Indicates the AVI file has an “idx1” chunk.
AVIF_MUSTUSEINDEX
Indicates the index should be used to determine the order of presentation of the data.
AVIF_ISINTERLEAVED
Indicates the AVI file is interleaved.
AVIF_WASCAPTUREFILE
Indicates the AVI file is a specially allocated file used for capturing real-time video.
AVIF_COPYRIGHTED
Indicates the AVI file contains copyrighted data.
The dwTotalFrames field of the main header specifies the total number of frames of data in
the file.
The dwInitialFrames is used for interleaved files.
The dwStreams field specifies the number of streams in the file.
The dwSuggestedBufferSize field specifies the suggested buffer size for reading the file.
Generally, this size should be large enough to contain the largest chunk in the file.
The dwWidth and dwHeight fields specify the width and height of the AVI file in pixels.
The dwScale and dwRate fields are used to specify the general time scale that the file will
use.
The dwStart and dwLength fields specify the starting time of the AVI file and the length of
the file. The units are defined by dwRate and dwScale. The dwStart field is usually set to
zero.
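Since dwMicroSecPerFrame gives the period between frames in microseconds, the frame rate and playing time of a clip follow directly from the main header. The C sketch below shows the arithmetic; the function names are hypothetical and uint32_t stands in for DWORD.

```c
#include <assert.h>
#include <stdint.h>

/* Frame rate implied by the frame period in microseconds. */
double frames_per_second(uint32_t micro_sec_per_frame)
{
    assert(micro_sec_per_frame > 0);
    return 1000000.0 / (double)micro_sec_per_frame;
}

/* Playing time in seconds for a given frame count and frame period. */
double duration_seconds(uint32_t total_frames, uint32_t micro_sec_per_frame)
{
    return (double)total_frames * (double)micro_sec_per_frame / 1000000.0;
}
```

A period of 40,000 microseconds corresponds to 25 frames per second, and a period of about 66,667 microseconds to 15 frames per second.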
THE STREAM HEADER (“STRL”) CHUNKS
The main header is followed by one or more “strl” chunks. Each “strl” chunk must contain a
stream header and stream format chunk. Stream header chunks are identified by the four-
character code “strh” and stream format chunks are identified with the four-character code
“strf”. In addition to the stream header and stream format chunks, the “strl” chunk might
also contain a stream data chunk. Stream data chunks are identified with the four-character
code “strd”.
The stream header has the following data structure defined for it:
typedef struct {
    FOURCC fccType;
    FOURCC fccHandler;
    DWORD dwFlags;
    DWORD dwReserved1;
    DWORD dwInitialFrames;
    DWORD dwScale;
    DWORD dwRate;
    DWORD dwStart;
    DWORD dwLength;
    DWORD dwSuggestedBufferSize;
    DWORD dwQuality;
    DWORD dwSampleSize;
} AVIStreamHeader;
The fccType field is set to “vids” if the stream it specifies contains video data. It is set to
“auds” if it contains audio data.
The fccHandler field contains a four-character code describing the installable compressor or
decompressor used with the data.
The dwFlags field contains any flags for the data stream. The AVISF_DISABLED flag
indicates that the stream data should be rendered only when explicitly enabled by the user.
The AVISF_VIDEO_PALCHANGES flag indicates palette changes are embedded in the file.
The dwInitialFrames is used for interleaved files. If you are creating interleaved files,
specify the number of frames in the file prior to the initial frame of the AVI sequence in this
field.
The remaining fields describe the playback characteristics of the stream. These factors
include the playback rate (dwScale and dwRate), the starting time of the sequence
(dwStart), the length of the sequence (dwLength), the size of the playback buffer
(dwSuggestedBufferSize), an indicator of the data quality (dwQuality), and sample size
(dwSampleSize).
THE LIST “MOVI” CHUNK
Following the header information is a LIST “movi” chunk that contains chunks of the actual
data in the streams; that is, the pictures and sounds themselves. The data chunks can reside
directly in the LIST “movi” chunk or they might be grouped into “rec ” chunks. Like any
RIFF chunk, the data chunks contain a four-character code to identify the chunk type. The
four-character code that identifies each chunk consists of the stream number and a two-
character code that defines the type of information encapsulated in the chunk. For example, a
waveform chunk is identified by a two-character code of “wb”. If a waveform chunk
corresponded to the second LIST “hdrl” stream description, it would have a four-character
code of “01wb”.
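The structure of these chunk identifiers can be illustrated with a small C sketch that splits an identifier such as 01wb into its stream number and two-character type. The sketch assumes the stream number is written as two decimal digits; the function name parse_stream_chunk_id is hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Split a data-chunk identifier such as "01wb" into stream number 1 and
 * type "wb". Returns 0 on success, or -1 if the first two characters are
 * not decimal digits (for example "LIST").
 */
int parse_stream_chunk_id(const char *id, int *stream, char type[3])
{
    assert(id != NULL && stream != NULL && type != NULL);
    if (!isdigit((unsigned char)id[0]) || !isdigit((unsigned char)id[1])) {
        return -1;
    }
    *stream = (id[0] - '0') * 10 + (id[1] - '0');
    type[0] = id[2];
    type[1] = id[3];
    type[2] = '\0';
    return 0;
}
```

The two-character type is then compared against the codes described in this section (wb, db, dc and pc) to decide how the payload is handled.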
Since all the format information is in the header, the audio data contained in these data
chunks does not contain any information about its format. An audio data chunk has the
following format (the ## in the format represents the stream identifier):
WAVE Bytes '##wb'
BYTE abBytes[];
Video data can be compressed or uncompressed DIBs. An uncompressed DIB has BI_RGB
specified for the biCompression field in its associated BITMAPINFO structure. A
compressed DIB has a value other than BI_RGB specified in the biCompression field.
A data chunk for an uncompressed DIB contains RGB video data. These chunks are
identified with a two-character code of “db” (db is an abbreviation for DIB bits). Data chunks
for a compressed DIB are identified with a two-character code of “dc” (dc is an abbreviation
for DIB compressed). Neither data chunk will contain any header information about the
DIBs. The data chunk for an uncompressed DIB has the following form:
DIB Bits '##db'
BYTE abBits[];
The data chunk for a compressed DIB has the following form:
Compressed DIB '##dc'
BYTE abBits[];
Video data chunks can also define new palette entries used to update the palette during an
AVI sequence. These chunks are identified with a two-character code of “pc” (pc is an
abbreviation for palette change). The following data structure is defined for palette
information:
typedef struct {
    BYTE bFirstEntry;
    BYTE bNumEntries;
    WORD wFlags;
    PALETTEENTRY peNew[];
} AVIPALCHANGE;
Fields
bFirstEntry
Specifies the first palette entry to change.
bNumEntries
Specifies the number of entries to change.
wFlags
Reserved. (This should be set to 0.)
peNew
Specifies an array of new palette entries.
The bFirstEntry field defines the first entry to change and the bNumEntries field specifies
the number of entries to change. The peNew field contains the new color entries.
If you include palette changes in a video stream, set the AVISF_VIDEO_PALCHANGES
flag in the dwFlags field of the stream header. This flag indicates that this video stream
contains palette changes and warns the playback software that it will need to animate the
palette.
THE “IDX1” CHUNK
AVI files can have an index chunk after the LIST “movi” chunk. The index chunk
essentially contains a list of the data chunks and their location in the file. Index chunks use
the four-character code “idx1”. The following data structure is defined for index entries:
typedef struct {
    DWORD ckid;
    DWORD dwFlags;
    DWORD dwChunkOffset;
    DWORD dwChunkLength;
} AVIINDEXENTRY;
The ckid, dwFlags, dwChunkOffset, and dwChunkLength entries are repeated in the AVI
file for each data chunk indexed. If the file is interleaved, the index will also have these
entries for each “rec ” chunk. The “rec ” entries should have the AVIIF_LIST flag set and the
list type in the ckid field. The ckid field identifies the data chunk. This field uses
four-character codes for identifying the chunk.
The dwFlags field specifies any flags for the data. The AVIIF_KEYFRAME flag indicates
key frames in the video sequence. Key frames do not need previous video information to be
decompressed. The AVIIF_NOTIME flag indicates a chunk does not affect the timing of a
video stream. The AVIIF_LIST flag indicates the current chunk is a LIST chunk. Use the
ckid field to identify the type of LIST chunk.
The dwChunkOffset and dwChunkLength fields specify the position of the chunk and the
length of the chunk. The dwChunkOffset field specifies the position of the chunk in the file
relative to the 'movi' list. The dwChunkLength field specifies the length of the chunk
excluding the eight bytes for the RIFF header.
If you include an index in the RIFF file, set the AVIF_HASINDEX flag in the dwFlags field
of the AVI header. (This header is identified by the “avih” chunk ID.) This flag indicates
that the file has an index.
OTHER DATA CHUNKS
If you need to align data in your AVI file you can add a “JUNK” chunk. The “JUNK” chunk
has the following form:
AVI Padding 'JUNK'
Byte data[]
APPENDIX – B
BITMAP FILE FORMAT
This topic describes the Bitmap-File Formats originally designed for the Windows Operating
System. Windows bitmap files are stored in a device-independent bitmap (DIB) format that
allows the system to display the bitmap on any type of display device. The term "device
independent" means that the bitmap specifies pixel color in a form independent of the
method used by a display to represent color. The default filename extension of a Windows
DIB file is .BMP.
BITMAP-FILE STRUCTURES
Each bitmap file contains a bitmap-file header, a bitmap-information header, a color table,
and an array of bytes that defines the bitmap bits. The file has the following form:
BITMAPFILEHEADER bmfh;
BITMAPINFOHEADER bmih;
RGBQUAD aColors[];
BYTE aBitmapBits[];
The bitmap-file header contains information about the type, size, and layout of a device-
independent bitmap file. The header is defined as a BITMAPFILEHEADER structure. The
bitmap-information header, defined as a BITMAPINFOHEADER structure, specifies the
dimensions, compression type, and color format for the bitmap.
The color table, defined as an array of RGBQUAD structures, contains as many elements as
there are colors in the bitmap. The color table is not present for bitmaps with 24 color bits
because each pixel is represented by 24-bit red-green-blue (RGB) values in the actual bitmap
data area. The colors in the table should appear in order of importance. This helps a display
driver render a bitmap on a device that cannot display as many colors as there are in the
bitmap. If the DIB is in Windows version 3.0 or later format, the driver can use the
biClrImportant member of the BITMAPINFOHEADER structure to determine which
colors are important.
The BITMAPINFO structure can be used to represent a combined bitmap-information
header and color table. The bitmap bits, immediately following the color table, consist of an
array of BYTE values representing consecutive rows, or "scan lines," of the bitmap. Each
scan line consists of consecutive bytes representing the pixels in the scan line, in left-to-right
order. The number of bytes representing a scan line depends on the color format and the
width, in pixels, of the bitmap. If necessary, a scan line must be zero-padded to end on a 32-
bit boundary. However, segment boundaries can appear anywhere in the bitmap. The scan
lines in the bitmap are stored from bottom up. This means that the first byte in the array
represents the pixels in the lower-left corner of the bitmap and the last byte represents the
pixels in the upper-right corner.
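The padding and bottom-up rules above can be sketched as a small calculation. This is an illustrative sketch only; the function names dib_stride and dib_row_offset are hypothetical and do not come from the Windows headers or from the project code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes occupied by one scan line, rounded up to a 32-bit (4-byte) boundary. */
size_t dib_stride(uint32_t width, uint16_t bit_count)
{
    return (((size_t)width * bit_count + 31u) / 32u) * 4u;
}

/* Byte offset, within the bitmap bits, of display row y (0 = top row).
   Scan lines are stored bottom-up, so the top row is stored last. */
size_t dib_row_offset(uint32_t width, uint32_t height,
                      uint16_t bit_count, uint32_t y)
{
    assert(y < height);
    return (size_t)(height - 1u - y) * dib_stride(width, bit_count);
}
```

For example, a 24-bit bitmap that is 3 pixels wide needs 9 bytes of pixel data per row, which is padded to 12 bytes.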
The biBitCount member of the BITMAPINFOHEADER structure determines the number
of bits that define each pixel and the maximum number of colors in the bitmap. This
member can have one of the following values:
1: Bitmap is monochrome and the color table contains two entries. Each
bit in the bitmap array represents a pixel. If the bit is clear, the pixel is
displayed with the color of the first entry in the color table. If the bit is
set, the pixel has the color of the second entry in the table.
4: Bitmap has a maximum of 16 colors. Each pixel in the bitmap is
represented by a 4-bit index into the color table. For example, if the first
byte in the bitmap is 0x1F, the byte represents two pixels. The first pixel
contains the color in the second table entry, and the second pixel contains
the color in the sixteenth table entry.
8: Bitmap has a maximum of 256 colors. Each pixel in the bitmap is
represented by a 1-byte index into the color table. For example, if the first
byte in the bitmap is 0x1F, the first pixel has the color of the
thirty-second table entry.
24: Bitmap has a maximum of 2^24 colors. The bmiColors (or bmciColors)
member is NULL, and each 3-byte sequence in the bitmap array represents the
relative intensities of blue, green, and red, respectively, for a pixel.
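For the palette-based formats (1, 4 and 8 bits per pixel), the lookup of a pixel's color-table index might be sketched as follows. The function name dib_palette_index is hypothetical. The sketch assumes that pixels are packed starting from the most significant bits of each byte, which agrees with the 0x1F example given for the 4-bit case.

```c
#include <assert.h>
#include <stdint.h>

/* Color-table index of pixel x within one scan line (1, 4 or 8 bits per pixel).
   The leftmost pixel occupies the most significant bits of a byte. */
unsigned dib_palette_index(const uint8_t *row, uint32_t x, unsigned bit_count)
{
    assert(bit_count == 1 || bit_count == 4 || bit_count == 8);
    switch (bit_count) {
    case 1:   /* eight pixels per byte */
        return (unsigned)(row[x / 8] >> (7u - x % 8u)) & 0x01u;
    case 4:   /* two pixels per byte, high nibble first */
        return (x % 2u == 0) ? (unsigned)(row[x / 2] >> 4)
                             : (unsigned)(row[x / 2] & 0x0Fu);
    default:  /* 8 bits: one pixel per byte */
        return row[x];
    }
}
```

The returned index is zero-based, so an index of 1 selects the second entry of the color table.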
The biClrUsed member of the BITMAPINFOHEADER structure specifies the number of
color indexes in the color table actually used by the bitmap. If the biClrUsed member is set
to zero, the bitmap uses the maximum number of colors corresponding to the value of the
biBitCount member. An alternative form of bitmap file uses the BITMAPCOREINFO,
BITMAPCOREHEADER, and RGBTRIPLE structures.
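The rule for the size of the color table can be sketched as below. The function name dib_color_table_entries is hypothetical; the sketch assumes a BITMAPINFOHEADER-style bitmap with biBitCount equal to 1, 4, 8 or 24.

```c
#include <assert.h>
#include <stdint.h>

/* Number of RGBQUAD entries in the color table, from biBitCount and biClrUsed. */
uint32_t dib_color_table_entries(uint16_t bit_count, uint32_t clr_used)
{
    assert(bit_count == 1 || bit_count == 4 || bit_count == 8 || bit_count == 24);
    if (clr_used != 0)
        return clr_used;          /* explicit count given in the header */
    if (bit_count == 24)
        return 0;                 /* 24-bit bitmaps carry no color table */
    return 1u << bit_count;       /* 1, 4, 8 bits -> 2, 16, 256 entries */
}
```

Each entry is one 4-byte RGBQUAD, so the table occupies four times this many bytes.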
BITMAP COMPRESSION
Windows versions 3.0 and later support run-length encoded (RLE) formats for compressing
bitmaps that use 4 bits per pixel and 8 bits per pixel. Compression reduces the disk and
memory storage required for a bitmap.
Compression of 8-Bits-per-Pixel Bitmaps:
When the biCompression member of the BITMAPINFOHEADER structure is set to
BI_RLE8, the DIB is compressed using a run-length encoded format for a
256-color bitmap. This format uses two modes: encoded mode and absolute mode.
Both modes can occur anywhere throughout a single bitmap.
Encoded Mode
A unit of information in encoded mode consists of two bytes. The first byte specifies the
number of consecutive pixels to be drawn using the color index contained in the second byte.
The first byte of the pair can be set to zero to indicate an escape that denotes the end of a line,
the end of the bitmap, or a delta. The interpretation of the escape depends on the value of the
second byte of the pair, which must be in the range 0x00 through 0x02. Following are the
meanings of the escape values that can be used in the second byte:
Second byte   Meaning
0             End of line.
1             End of bitmap.
2             Delta. The two bytes following the escape contain unsigned values indicating
              the horizontal and vertical offsets of the next pixel from the current position.
Absolute Mode
Absolute mode is signaled by the first byte in the pair being set to zero and
the second byte to a value between 0x03 and 0xFF. The second byte represents
the number of bytes that follow, each of which contains the color index of a
single pixel. Each run must be aligned on a word boundary. Following is an
example of an 8-bit RLE bitmap (the two-digit hexadecimal values in the
second column represent a color index for a single pixel):
Compressed data         Expanded data
03 04                   04 04 04
05 06                   06 06 06 06 06
00 03 45 56 67 00       45 56 67
02 78                   78 78
00 02 05 01             Move 5 right and 1 down
02 78                   78 78
00 00                   End of line
09 1E                   1E 1E 1E 1E 1E 1E 1E 1E 1E
00 01                   End of RLE bitmap
```
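The two modes described above can be combined into a small decoder. The following is a minimal sketch and not the decoder used in the project; the function name rle8_decode and its interface are hypothetical. It writes color indexes in stored scan-line order, so a delta of "1 down" simply advances to the next stored row, and pixels skipped by a delta or an early end of line are left at index 0.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Decode BI_RLE8 data into out, a width*height array of color indexes
   (row 0 is the first scan line stored in the file).
   Returns 0 on success, -1 on malformed or truncated input. */
int rle8_decode(const uint8_t *src, size_t src_len,
                uint8_t *out, size_t width, size_t height)
{
    size_t i = 0, x = 0, y = 0;

    memset(out, 0, width * height);
    while (i + 1 < src_len) {
        uint8_t count = src[i], value = src[i + 1];
        i += 2;
        if (count > 0) {                      /* encoded mode: run of one color */
            if (y >= height || count > width - x) return -1;
            memset(out + y * width + x, value, count);
            x += count;
        } else if (value == 0) {              /* escape: end of line */
            x = 0;
            y++;
        } else if (value == 1) {              /* escape: end of bitmap */
            return 0;
        } else if (value == 2) {              /* escape: delta */
            if (i + 1 >= src_len) return -1;
            x += src[i];
            y += src[i + 1];
            i += 2;
            if (x > width) return -1;
        } else {                              /* absolute mode: literal pixels */
            if (y >= height || value > width - x || value > src_len - i)
                return -1;
            memcpy(out + y * width + x, src + i, value);
            x += value;
            i += value + (value & 1u);        /* skip pad byte to word boundary */
        }
    }
    return -1;                                /* no end-of-bitmap escape found */
}
```

Fed the compressed bytes from the example table with a width of 20 and a height of 3, this sketch fills the first 13 pixels of row 0, the last two pixels of row 1 and the first nine pixels of row 2.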
BITMAPFILEHEADER (3.0)
typedef struct tagBITMAPFILEHEADER { /* bmfh */
    UINT  bfType;
    DWORD bfSize;
    UINT  bfReserved1;
    UINT  bfReserved2;
    DWORD bfOffBits;
} BITMAPFILEHEADER;
The BITMAPFILEHEADER structure contains information about the type, size, and layout
of a device-independent bitmap (DIB) file.
The bfType specifies the type of file. This member must be BM.
The bfSize specifies the size of the file, in bytes.
The bfReserved1 is reserved; must be set to zero.
The bfReserved2 is reserved; must be set to zero.
The bfOffBits specifies the byte offset from the BITMAPFILEHEADER structure to the
actual bitmap data in the file.
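Reading this header in a portable way might look like the sketch below. It assumes the usual on-disk layout of 14 little-endian bytes (bfType at offset 0, bfSize at 2, the two reserved words at 6 and 8, bfOffBits at 10). The names FileHeader, rd16, rd32 and parse_file_header are hypothetical and are not part of the Windows API.

```c
#include <stdint.h>

typedef struct {
    uint16_t type;      /* bfType: the characters 'B','M' (0x4D42) */
    uint32_t size;      /* bfSize: size of the file in bytes */
    uint32_t off_bits;  /* bfOffBits: offset of the bitmap bits */
} FileHeader;

static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if the signature is not "BM" or a reserved
   field is nonzero. */
int parse_file_header(const uint8_t buf[14], FileHeader *h)
{
    if (buf[0] != 'B' || buf[1] != 'M') return -1;
    if (rd16(buf + 6) != 0 || rd16(buf + 8) != 0) return -1;
    h->type = rd16(buf);
    h->size = rd32(buf + 2);
    h->off_bits = rd32(buf + 10);
    return 0;
}
```

Reading the fields byte by byte, rather than copying the bytes straight into a C structure, avoids problems with structure padding and byte order on non-Windows platforms.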
BITMAPINFO (3.0)
typedef struct tagBITMAPINFO { /* bmi */
    BITMAPINFOHEADER bmiHeader;
    RGBQUAD          bmiColors[1];
} BITMAPINFO;
The BITMAPINFO structure fully defines the dimensions and color information for a
Windows 3.0 or later device-independent bitmap (DIB).
The bmiHeader specifies a BITMAPINFOHEADER structure that contains information
about the dimensions and color format of a DIB.
The bmiColors specifies an array of RGBQUAD structures that define the colors in the
bitmap.
BITMAPINFOHEADER (3.0)
typedef struct tagBITMAPINFOHEADER { /* bmih */
    DWORD biSize;
    LONG  biWidth;
    LONG  biHeight;
    WORD  biPlanes;
    WORD  biBitCount;
    DWORD biCompression;
    DWORD biSizeImage;
    LONG  biXPelsPerMeter;
    LONG  biYPelsPerMeter;
    DWORD biClrUsed;
    DWORD biClrImportant;
} BITMAPINFOHEADER;
The BITMAPINFOHEADER structure contains information about the dimensions and
color format of a Windows 3.0 or later device-independent bitmap (DIB).
The biSize specifies the number of bytes required by the BITMAPINFOHEADER
structure.
The biWidth specifies the width of the bitmap, in pixels.
The biHeight specifies the height of the bitmap, in pixels.
The biPlanes specifies the number of planes for the target device. This member must be set
to 1.
The biBitCount specifies the number of bits per pixel. This value must be 1, 4, 8, or 24.
The biCompression specifies the type of compression for a compressed bitmap.
The various values and their meaning are as follows:
BI_RGB: Specifies that the bitmap is not compressed.
BI_RLE8: Specifies a run-length encoded format for bitmaps with 8 bits per pixel. The
compression format is a 2-byte format consisting of a count byte followed by
a byte containing a color index. For more information, see the BITMAP
COMPRESSION section above.
BI_RLE4: Specifies a run-length encoded format for bitmaps with 4 bits per pixel. The
compression format is a 2-byte format consisting of a count byte followed by
a byte containing two 4-bit color indexes.
The biSizeImage specifies the size, in bytes, of the image. It is valid to set this member to
zero if the bitmap is in the BI_RGB format.
The biXPelsPerMeter specifies the horizontal resolution, in pixels per meter, of the target
device for the bitmap. An application can use this value to select a bitmap from a resource
group that best matches the characteristics of the current device.
The biYPelsPerMeter specifies the vertical resolution, in pixels per meter, of the target
device for the bitmap.
The biClrUsed specifies the number of color indexes in the color table actually used by the
bitmap. If this value is zero, the bitmap uses the maximum number of colors corresponding
to the value of the biBitCount member. For more information on the maximum sizes of the
color table, see the description of the BITMAPINFO structure earlier in this topic.
If the biClrUsed member is nonzero, it specifies the actual number of colors that the
graphics engine or device driver will access if the biBitCount member is less than 24. If
biBitCount is set to 24, biClrUsed specifies the size of the reference color table used to
optimize performance of Windows color palettes. If the bitmap is a packed bitmap (that is, a
bitmap in which the bitmap array immediately follows the BITMAPINFO header and which
is referenced by a single pointer), the biClrUsed member must be set to zero or to the actual
size of the color table.
The biClrImportant specifies the number of color indexes that are considered important for
displaying the bitmap. If this value is zero, all colors are important.
RGBQUAD (3.0)
typedef struct tagRGBQUAD { /* rgbq */
    BYTE rgbBlue;
    BYTE rgbGreen;
    BYTE rgbRed;
    BYTE rgbReserved;
} RGBQUAD;
The RGBQUAD structure describes a color consisting of relative intensities of red, green,
and blue. The bmiColors member of the BITMAPINFO structure consists of an array of
RGBQUAD structures.
The rgbBlue specifies the intensity of blue in the color.
The rgbGreen specifies the intensity of green in the color.
The rgbRed specifies the intensity of red in the color.
The rgbReserved is not used; must be set to zero.
RGB (2.x)
COLORREF RGB(cRed, cGreen, cBlue)
BYTE cRed; /* red component of color */
BYTE cGreen; /* green component of color */
BYTE cBlue; /* blue component of color */
The RGB macro selects an RGB color based on the parameters supplied and the color
capabilities of the output device.
cRed Specifies the intensity of the red color field.
cGreen Specifies the intensity of the green color field.
cBlue Specifies the intensity of the blue color field.
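As an illustration, the packing performed by the RGB macro can be sketched in portable C. The sketch assumes the usual Windows convention that a COLORREF holds red in the lowest byte, then green, then blue (0x00bbggrr). The names pack_rgb, Quad and quad_to_colorref are hypothetical stand-ins, not Windows identifiers.

```c
#include <stdint.h>

/* Stand-in for RGBQUAD: note that blue comes first in memory. */
typedef struct {
    uint8_t blue, green, red, reserved;
} Quad;

/* Pack three intensities into a 32-bit value laid out as 0x00bbggrr. */
uint32_t pack_rgb(uint8_t red, uint8_t green, uint8_t blue)
{
    return (uint32_t)red | ((uint32_t)green << 8) | ((uint32_t)blue << 16);
}

/* Convert a color-table entry to the packed form. */
uint32_t quad_to_colorref(Quad q)
{
    return pack_rgb(q.red, q.green, q.blue);
}
```

The point to note is that the byte order of a color-table entry (blue, green, red) is the reverse of the packed value's byte order, so the two cannot be copied into each other directly.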
REFERENCES
[1] Vasudev Bhaskaran and Konstantinos Konstantinides, Image and Video
Compression Standards, Algorithms and Architectures – 2nd Edition, Kluwer
Academic Publishers.
[2] Rafael C. Gonzalez and Richard E. Woods, Digital Image Processing.
[3] Bjarne Stroustrup, The C++ Programming Language – 3rd Edition, Addison
Wesley.
[4] ANSI/ISO C++ Professional Programmer's Handbook, Macmillan Computer
Publishing.
[5] Binh Nguyen, Linux Filesystem Hierarchy, Version 0.63
[6] Machtelt Garrels, Introduction to Linux – A Hands on Guide, Version 1.0
WEBSITES:
http://www.fastgraph.com/help/avi_header_format.html
http://www.fastgraph.com/help/bmp_header_format.html
http://pvdtools.sourceforge.net/aviformat.txt
http://pvdtools.sourceforge.net/bmpformat.txt
http://msdn.microsoft.com/library/default.asp?url=/library/enus/directx9_c/directx/htm/avirifffilereference.asp
http://msdn.microsoft.com/library/default.asp?url=/library/enus/directx9_c/directx/htm/bmpfilereference.asp
http://www.cs.wayne.edu/~dil/research/mdc/
http://dmoz.org/Computers/Multimedia/MPEG/
http://www.apl.jhu.edu/Notes/Geckle/525759/
http://www.autosophy.com/videcomp.htm
http://www.mathtools.net/Applications/DSP/Image_and_Video_Processng
http://aanda.u-trasbg.fr:2002/articles/astro/full/1999/09/ds1667/node2.html
http://www.nd.edu/~rls/Research/Compression/
http://www.ee.ualberta.ca/~mandal/index-info/img+vid.html
http://www.wave-report.com/tutorials/VC.htm
http://www.apl.jhu.edu/Notes/Beser/525759/icpvf02lect5.pdf