Multimedia Retrieval Architecture

Anandi Giridharan, Electrical Communication Engineering,

Indian Institute of Science, Bangalore – 560012, India

Multimedia Storage Techniques


Media and Storage Requirements

Characteristics of multimedia data and their storage requirements.

Multimedia data tends to be voluminous, e.g., 100 min of video compressed using the JPEG compression algorithm requires 9 GB of storage space. Most storage systems do not provide such large contiguous locations.

Continuous media data, such as video and audio, have timing characteristics associated with them.

Real-time data needs to be collected without losing any portion of it, which imposes timing constraints on multimedia data.
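As a rough check on what the 9 GB figure means for a storage system, the sketch below works out the sustained rate that the disk has to deliver during playback. It assumes decimal gigabytes (1 GB = 10^9 bytes); the function name is illustrative only.

```python
def sustained_bit_rate(total_bytes, minutes):
    """Average bit rate (bits/sec) needed to stream total_bytes in the given time."""
    return total_bytes * 8 / (minutes * 60)


# 9 GB of compressed video played back over 100 minutes
# (assumption: 1 GB = 10**9 bytes).
rate = sustained_bit_rate(9 * 10**9, 100)
print(rate / 1e6)  # 12.0 (Mbit/s)
```

The storage system has to sustain this rate without gaps, which is where the timing constraints come from.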


Media requirements of MM applications and storage space.


Multimedia Standards

• Standards imply consistency and conformity, which means they facilitate interoperability and compatibility.

• Standards in computing are developed to solve problems:

– Interoperability – allowing systems to communicate with each other (e.g., TCP/IP)

– Portability – allowing software to work on different systems (e.g., Java)

– Data exchange – allowing data to be transferred to different systems (e.g., JPEG)

• Factors to consider: Lifetime, Portability and Costs


Storage Structures of Video Data

• In digital video, 4 types of control information have to be considered for the smooth running of any MM application

• Control Information

• Frame Rate:

• Video is made up of 30 (or 24) pictures or frames for every second of video.

• Frames are split in half (odd lines and even lines), to form what are called fields.

• Interlaced video: When a television set displays its analogue video signal, it displays the odd lines (the odd field) first. Then it displays the even lines (the even field).

• Non-interlaced video: A computer monitor uses “progressive scan” to update the screen. The computer displays each line in sequence, from top to bottom.
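The split of a frame into odd and even fields can be sketched as follows. This is a minimal illustration that models a frame as a list of scan lines numbered from 1; the function names are hypothetical.

```python
def split_fields(frame):
    """Split a frame (list of scan lines) into its odd and even fields.

    Lines are numbered from 1, so the odd field holds lines 1, 3, 5, ...
    and the even field holds lines 2, 4, 6, ...
    """
    odd_field = frame[0::2]
    even_field = frame[1::2]
    return odd_field, even_field


def weave_fields(odd_field, even_field):
    """Interleave two fields back into one progressive frame."""
    frame = []
    for i in range(len(odd_field)):
        frame.append(odd_field[i])
        if i < len(even_field):
            frame.append(even_field[i])
    return frame


frame = ["line1", "line2", "line3", "line4", "line5", "line6"]
odd, even = split_fields(frame)
print(odd)                               # ['line1', 'line3', 'line5']
print(even)                              # ['line2', 'line4', 'line6']
print(weave_fields(odd, even) == frame)  # True
```

An interlaced display shows `odd` and then `even`; a progressive display shows the lines of `frame` in order.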


Interlaced video on the left, deinterlaced video on the right. Source: http://www.streaminglearningcenter.com/articles/shooting-for-streaming---progressive-or-interlaced.html


Storage Structures of Video Data

• Control Information

• Color Resolution:

– Color resolution refers to the number of colors displayed on the screen at one time

– Common color models are RGB (red-green-blue) and YUV (a luminance (brightness) component Y plus two chrominance (color) components U and V)

• Spatial Resolution:

– “How big is the picture?” Resolution

• Image Quality:

– Video should look acceptable for an application.


Spatial resolution is a parameter that shows how many pixels are used to represent a real object in digital form. Fig. 2 shows the same color image represented at different spatial resolutions. The flower on the left has a much better resolution than the one on the right.


Video Data Compression

• Factors associated with compression

– Real-Time versus Non-Real-Time

• Some systems compress to disk, decompress, and play back video (30 fps) all in real time, with no delays. Other systems can capture only some of the 30 fps and can play back only some of the frames.

– Symmetrical versus Asymmetrical

• Symmetrical: if a sequence of 640x480 frames can be played back at 30 fps, then capturing, compressing, and storing are also possible at the same rate.

• Asymmetrical: the opposite of symmetrical. Compression takes a lot longer and is more elaborate than decompression.


Compression Ratios

The compression ratio is the numerical representation of the original video in comparison to the compressed video, e.g., a 200:1 compression ratio means that the original video is represented by the number 200 and the compressed video by the smaller number, in this case 1. In other words, the original is 200 times larger than the compressed video.
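The arithmetic behind a compression ratio can be sketched in a few lines. The frame size used here (one 640x480 frame at 24 bits/pixel) is only an illustrative input, and the function names are hypothetical.

```python
def compression_ratio(original_bytes, compressed_bytes):
    """Ratio of original size to compressed size (200.0 means 200:1)."""
    return original_bytes / compressed_bytes


def compressed_size(original_bytes, ratio):
    """Size that remains after compressing at the given ratio."""
    return original_bytes / ratio


one_frame = 640 * 480 * 3                  # 24 bits/pixel = 3 bytes/pixel
print(one_frame)                           # 921600
print(compressed_size(one_frame, 200))     # 4608.0
print(compression_ratio(one_frame, 4608))  # 200.0
```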


Lossless Versus Lossy

Loss factor determines whether there is a loss of quality between the original image and the image after it has been compressed.

With lossless compression, every single bit of data that was originally in the file remains after the file is uncompressed. All of the information is completely restored.

Lossy compression reduces a file by permanently eliminating certain information, especially redundant information. When the file is uncompressed, only a part of the original information is still there (although the user may not notice it).
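The difference can be demonstrated on a small byte sequence. The sketch below uses Python's standard `zlib` module for the lossless case; the lossy case is a deliberately crude stand-in (dropping the low-order bits of each sample, with hypothetical helper names) and is not the method of any particular codec.

```python
import zlib

data = bytes([10, 10, 10, 10, 12, 13, 200, 201] * 100)

# Lossless: every byte comes back after decompression.
packed = zlib.compress(data)
restored = zlib.decompress(packed)
print(restored == data)         # True
print(len(packed) < len(data))  # True


# Lossy (illustrative only): keep one code per sample at a coarser step,
# so the low-order detail is permanently discarded.
def lossy_encode(samples, step=16):
    return bytes(s // step for s in samples)


def lossy_decode(codes, step=16):
    return bytes(c * step for c in codes)


approx = lossy_decode(lossy_encode(data))
print(approx == data)           # False: the original cannot be recovered
worst_error = max(abs(a - b) for a, b in zip(approx, data))
print(worst_error < 16)         # True: the error stays below one step
```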


Examples of lossless and lossy (200:1) images decoded from the same file.


Video Data Compression

• Interframe versus Intraframe

– Intraframe method: compresses and stores each video frame as a discrete picture.

– Interframe method: a reference frame and the differences between frames are recorded.

• Bit Rate Control

– Parameters such as the frame rate and the quality of the images should be adjustable to suit the application requirements.

• Selecting a Compression Technique

– Motion JPEG, MPEG-1, MPEG-2, MPEG-4, and Motion JPEG 2000 are internationally recognized standards for the compression of moving pictures.
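The interframe idea (one reference frame plus recorded differences) can be sketched as below. This is a simplified illustration that treats a frame as a flat list of pixel values and differences each frame against the previous one; real codecs work block by block with motion compensation, and the function names are hypothetical.

```python
def encode_interframe(frames):
    """Return the first frame as reference plus per-frame differences."""
    reference = frames[0]
    diffs = []
    previous = reference
    for frame in frames[1:]:
        diffs.append([cur - prev for cur, prev in zip(frame, previous)])
        previous = frame
    return reference, diffs


def decode_interframe(reference, diffs):
    """Rebuild every frame by adding each difference to the frame before it."""
    frames = [list(reference)]
    for diff in diffs:
        frames.append([p + d for p, d in zip(frames[-1], diff)])
    return frames


frames = [
    [10, 10, 10, 10],
    [10, 10, 12, 10],
    [10, 10, 12, 11],
]
reference, diffs = encode_interframe(frames)
print(diffs)  # [[0, 0, 2, 0], [0, 0, 0, 1]]
print(decode_interframe(reference, diffs) == frames)  # True
```

Because consecutive frames are similar, the differences are mostly zeros, which compress far better than the full frames that an intraframe method would store.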


Data compression

Data compression converts an input data stream into another stream of smaller size.

It is the process of reducing the amount of data needed for storage, typically by the use of encoding techniques.

Compression helps to:

• Reduce storage space

• Reduce bandwidth

• Lower cost

• Enable new applications


Audio Compression

• Predictive encoding: The differences between samples are encoded instead of the absolute sample values, resulting in lower bit rates. The compression achieved is not very high.

• Perceptual encoding: It makes use of the flaws in our auditory system, based on the study of how people perceive sound:

– The ear's sensitivity to sound is not uniform.

– The ear is most sensitive from 2 to 4 kHz; it is less sensitive to higher or lower ranges.

– Audio samples that are below the threshold can be deleted.

– Some sounds can mask other sounds.
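Predictive encoding in its simplest form stores each sample as the difference from the previous one. The sketch below is a minimal delta encoder under that assumption (the predictor is just "the previous sample"; practical coders use better predictors and quantize the differences). The function names are illustrative.

```python
def delta_encode(samples):
    """Encode each sample as its difference from the previous sample."""
    encoded = []
    previous = 0
    for s in samples:
        encoded.append(s - previous)
        previous = s
    return encoded


def delta_decode(deltas):
    """Recover the samples by accumulating the differences."""
    samples = []
    current = 0
    for d in deltas:
        current += d
        samples.append(current)
    return samples


samples = [1000, 1003, 1007, 1006, 1002]
deltas = delta_encode(samples)
print(deltas)                           # [1000, 3, 4, -1, -4]
print(delta_decode(deltas) == samples)  # True
```

After the first value, the differences are small numbers that need fewer bits than the absolute sample values.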


Lossy Audio Compression

Sounds are masked by other sounds.

• Frequency masking: A loud sound in one frequency range can partially or fully mask another sound in a nearby frequency range.

• Temporal masking: A loud sound can numb our ears for a short duration even after the sound has stopped.


MPEG-1 Audio Compression

• Sampling is done at 32 kHz, 44.1 kHz, or 48 kHz.

• 44.1 kHz is used for CD-quality audio.

• The signal is converted from the time domain to the frequency domain using the Fast Fourier Transform.

• The resulting spectrum is divided into at most 32 frequency bands, each of which is processed separately.

• Frequency ranges that are completely masked are allocated zero bits.

• Ranges that are partially masked are allocated a small number of bits.

• Ranges that are not masked are allocated a large number of bits.

• In the case of stereo, the redundancy between the two audio channels is exploited.
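The masking-driven bit allocation can be sketched as follows. This is a toy model, not the allocation procedure of the MPEG-1 standard: it assumes each band has a signal level and a masking threshold in dB, and it uses the common rule of thumb of roughly 6 dB of signal-to-noise ratio per bit. The function name, the `max_bits` cap, and the sample levels are all assumptions for illustration.

```python
import math


def allocate_bits(signal_db, mask_db, db_per_bit=6.0, max_bits=15):
    """Give each band enough bits to cover its signal-to-mask ratio."""
    bits = []
    for s, m in zip(signal_db, mask_db):
        smr = s - m  # how far the signal rises above the masking threshold
        if smr <= 0:
            bits.append(0)  # completely masked: nothing to send
        else:
            bits.append(min(max_bits, math.ceil(smr / db_per_bit)))
    return bits


signal_db = [60, 40, 55, 20]  # hypothetical band levels
mask_db = [20, 45, 50, 30]    # hypothetical masking thresholds
print(allocate_bits(signal_db, mask_db))  # [7, 0, 1, 0]
```

Bands 2 and 4 are fully masked and get zero bits, band 3 is mostly masked and gets one bit, and the unmasked band 1 gets the most.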


Video Compression

• Video is a temporal combination of frames.

• Each frame can be considered a still image consisting of a spatial combination of pixels.

• Two principles:

• Joint Photographic Experts Group (JPEG): used to compress images by removing the spatial redundancy that exists in each frame.

• Moving Picture Experts Group (MPEG): used to compress video by removing the temporal redundancy of a set of frames.


JPEG

• JPEG involves four steps

– Block preparation

– Discrete cosine transformation

– Quantization

– Compression


Phases of JPEG


Block preparation

• Block preparation: After the video signal is digitized, it is converted to an array of pixels, e.g., 640x480 pixels.

• Each pixel has R, G, and B components of 8 bits each, for a total of 24 bits/pixel.

• Before compression it is converted to luminance (brightness, to which our eyes are more sensitive) and chrominance (color).

• Our eyes are less sensitive to chrominance, so it is sent at a lower resolution. The luminance signal is compatible with black-and-white pictures.

• This allows more compression, so the image is represented in YUV.

• Y = 0.30R + 0.59G + 0.11B

• U = -0.18R - 0.29G + 0.44B

• V = 0.62R - 0.52G - 0.10B
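The conversion can be written directly from the coefficients listed above, with 0.62 taken as the coefficient of R in the V equation. The sketch below is a plain transcription of those three formulas; the function name is illustrative, and other references use slightly different coefficient sets.

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel to YUV using the coefficients from the slide."""
    y = 0.30 * r + 0.59 * g + 0.11 * b
    u = -0.18 * r - 0.29 * g + 0.44 * b
    v = 0.62 * r - 0.52 * g - 0.10 * b
    return y, u, v


# A gray pixel: luminance equals the gray level and V vanishes,
# since the V coefficients sum to zero.
y, u, v = rgb_to_yuv(128, 128, 128)
print(round(y, 2), round(v, 2))

# A pure red pixel carries most of its information in V.
print(rgb_to_yuv(255, 0, 0))
```

A black-and-white receiver can use the Y value alone, which is why the luminance signal stays compatible with monochrome pictures.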


Discrete Cosine transform

• Each block of 64 pixels goes through a transformation called the DCT.

• Example 1: a block with uniform intensity has only one DC component; the other (AC) components are zero.

• Example 2: a block with 2 different intensities has one DC component and a few AC components; the number of zeros is still large.
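Both cases can be checked numerically. The sketch below is a naive (slow) 8x8 two-dimensional DCT-II with the usual orthonormal scaling, written for clarity rather than speed; the two test blocks and the helper names are assumptions chosen to mirror the two examples.

```python
import math

N = 8


def dct_2d(block):
    """Naive 8x8 two-dimensional DCT-II with orthonormal scaling."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = 0.25 * c(u) * c(v) * total
    return out


def count_nonzero(coeffs, eps=1e-6):
    """Count coefficients that differ from zero by more than rounding noise."""
    return sum(1 for row in coeffs for value in row if abs(value) > eps)


uniform = [[20] * N for _ in range(N)]                # one intensity
two_level = [[20] * 4 + [60] * 4 for _ in range(N)]   # two intensities

print(round(dct_2d(uniform)[0][0], 3))   # 160.0 (the DC component)
print(count_nonzero(dct_2d(uniform)))    # 1: DC only
print(count_nonzero(dct_2d(two_level)))  # 5: DC plus four AC components
```

In both blocks almost all of the 64 coefficients are zero, which is what the later steps exploit.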


Quantization

Further increases the number of zeros.
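The effect can be sketched by dividing each coefficient by a step size and rounding. The 4x4 coefficient block and the step table below are made up for illustration (JPEG works on 8x8 blocks and the standard tables differ); the point is only that larger steps for higher-frequency positions turn small coefficients into zeros.

```python
def quantize(coeffs, table):
    """Divide each coefficient by its step size and round to an integer."""
    return [[round(c / q) for c, q in zip(c_row, q_row)]
            for c_row, q_row in zip(coeffs, table)]


def count_zeros(block):
    return sum(1 for row in block for value in row if value == 0)


# Hypothetical DCT coefficients (small 4x4 block to keep the example short).
coeffs = [
    [160, 35, 3, 1],
    [22, 9, 2, 0],
    [4, 2, 1, 0],
    [1, 0, 0, 0],
]
# Illustrative step sizes that grow with frequency: 8 * (1 + row + column).
table = [[8 * (1 + r + c) for c in range(4)] for r in range(4)]

levels = quantize(coeffs, table)
print(levels)               # [[20, 2, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
print(count_zeros(coeffs))  # 5
print(count_zeros(levels))  # 13
```

Quantization is the lossy step: the small coefficients that were rounded to zero cannot be recovered by the decoder.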


Zig Zag scanning

• Collects the zeros together so that they can be sent compactly, as a small count rather than as many separate values.


MPEG-1

• The first standard to be finalized for video compression, intended for interactive video on CD and for digital audio broadcasting.

• VCR quality: 640x480 pixels, 24 bits/pixel, and 25 frames/sec gives 184.32 Mbps uncompressed (UC)

• MPEG-1 compression brings this down to 1.5 Mbps.

• It is likely to dominate the encoding of CD-ROM based movies, as it gives good-quality movies.

• It can be transmitted over twisted pair for modest distances (5 km).
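The uncompressed rate is just the product of the picture parameters. For 640x480 pixels at 24 bits/pixel the product is 184.32 Mbps at 25 frames/sec, and it doubles to 368.64 Mbps at 50 pictures/sec. The sketch below computes it along with the reduction needed to reach 1.5 Mbps; the function name is illustrative.

```python
def raw_bit_rate(width, height, bits_per_pixel, frames_per_sec):
    """Uncompressed video bit rate in bits per second."""
    return width * height * bits_per_pixel * frames_per_sec


rate_25 = raw_bit_rate(640, 480, 24, 25)
rate_50 = raw_bit_rate(640, 480, 24, 50)
print(f"{rate_25 / 1e6:.2f} Mbps")  # 184.32 Mbps
print(f"{rate_50 / 1e6:.2f} Mbps")  # 368.64 Mbps

# Reduction needed to fit the 25 frames/sec stream into 1.5 Mbps.
print(f"{rate_25 / 1.5e6:.2f}:1")   # 122.88:1
```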


MPEG-1

• It has 3 components: audio, video, and system.

• A 90 kHz clock outputs the current time value (time stamps) to both encoders; the time stamps are propagated all the way to the receiver.

Figure: the audio signal and the video signal enter the audio encoder and the video encoder, which are driven by the 90 kHz clock; the system multiplexer produces the MPEG-1 output.
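The role of the 90 kHz clock can be illustrated by converting a frame position into clock ticks. This is a minimal sketch that assumes a constant frame rate and ignores how the time stamps are actually packed into the stream; the names are hypothetical.

```python
CLOCK_HZ = 90_000  # system clock rate


def timestamp_ticks(frame_index, frames_per_sec):
    """Time stamp, in 90 kHz clock ticks, at which a frame should be shown."""
    return frame_index * CLOCK_HZ // frames_per_sec


print(timestamp_ticks(1, 25))   # 3600 ticks between frames at 25 frames/sec
print(timestamp_ticks(25, 25))  # 90000 ticks = exactly one second
print(timestamp_ticks(1, 30))   # 3000 ticks between frames at 30 frames/sec
```

Since the audio and video encoders stamp their output from the same clock, the receiver can line the two streams up again by comparing time stamps.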


MPEG-1 Video Compression

• Encoding each frame separately with JPEG removes spatial redundancy.

• Additional compression can be achieved by taking advantage of the fact that consecutive frames are often almost identical.

• MPEG-1 has 4 kinds of frames; P and B frames use motion compensation (the difference between 2 frames is computed).

• P (Predictive) frames: use the block-by-block difference from the preceding I or P frame.

• B (Bidirectional) frames: use the differences from both the preceding and the following I or P frames, which serve as references.

• I (Intracoded) frames: self-contained, JPEG-encoded frames that appear periodically and can be decoded independently.

• D (DC-coded) frames: block averages used for fast forward.
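The block averages behind a D frame can be sketched as follows. The code assumes a grayscale frame stored as a list of rows whose dimensions are multiples of the block size, and simply averages each 8x8 block; the function name and the test frame are illustrative.

```python
def block_averages(frame, block=8):
    """Replace each block x block tile of the frame by its average value."""
    rows, cols = len(frame), len(frame[0])
    averages = []
    for r in range(0, rows, block):
        out_row = []
        for c in range(0, cols, block):
            total = sum(frame[y][x]
                        for y in range(r, r + block)
                        for x in range(c, c + block))
            out_row.append(total / (block * block))
        averages.append(out_row)
    return averages


# A 16x16 frame: dark left half, bright right half.
frame = [[20] * 8 + [60] * 8 for _ in range(16)]
print(block_averages(frame))  # [[20.0, 60.0], [20.0, 60.0]]
```

The result is a very coarse picture (one value per block), which is cheap to decode and good enough to see where one is while fast forwarding.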


MPEG Frames


Frame construction


MPEG-2

• Similar to MPEG-1

• Developed for Digital TV

• D frames are not supported, so there is no fast forward

• DCT-10*10 instead of 8 * 8

• For better quality

• Supports 4 resolutions and 5 profiles.

• Has a more general way of multiplexing

• Each stream is packetized with time stamps


MPEG-4

• Started as a standard for low bit rates

• For use in portable devices such as video phones

• Standard includes much more than just data compression

• Functionality: Content based MM access tools

• Manipulation and Bit stream editing

• Improved temporal random access

• Robustness in error prone environment

• Content-based scalability.


H.261

• H.261 is an ITU-T video coding standard.

• H.261 was originally designed for transmission over ISDN lines, on which data rates are multiples of 64 kbit/s.

• The coding algorithm was designed to operate at video bit rates between 40 kbit/s and 2 Mbit/s.

• The standard supports two video frame sizes: CIF (Common Intermediate Format) and QCIF (Quarter CIF), using a 4:2:0 sampling scheme.

• Both the encoder and the decoder must be very fast, since H.261 is used for interactive, real-time video conferencing (VC).
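The two frame sizes and the 4:2:0 scheme translate into raw frame sizes as sketched below. CIF is 352x288 luminance pixels and QCIF is 176x144; in 4:2:0 each of the two chrominance planes has half the width and half the height of the luminance plane. The code assumes 8 bits per sample, and the names are illustrative.

```python
FRAME_SIZES = {"CIF": (352, 288), "QCIF": (176, 144)}


def frame_bytes_420(width, height, bits_per_sample=8):
    """Raw size of one 4:2:0 frame: full-size Y plus two quarter-size chroma planes."""
    luma = width * height
    chroma = 2 * (width // 2) * (height // 2)
    return (luma + chroma) * bits_per_sample // 8


def isdn_rate_kbits(p):
    """Channel rate for p ISDN channels of 64 kbit/s each."""
    return p * 64


for name, (w, h) in FRAME_SIZES.items():
    print(name, frame_bytes_420(w, h))
# CIF 152064
# QCIF 38016

print(isdn_rate_kbits(1), isdn_rate_kbits(2))  # 64 128
```

Even a single raw QCIF frame is several times larger than what a 64 kbit/s channel carries in one second, which is why a fast encoder and decoder are essential for interactive use.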
