lec2
DESCRIPTION
MultimideaTRANSCRIPT
![Page 1: Lec2](https://reader035.vdocuments.site/reader035/viewer/2022070316/55583677d8b42acb078b4885/html5/thumbnails/1.jpg)
Digital Audio, Image and Video
Sethserey SAMComputer Science Department
![Page 2: Lec2](https://reader035.vdocuments.site/reader035/viewer/2022070316/55583677d8b42acb078b4885/html5/thumbnails/2.jpg)
Digital Media
In computers, audio, image and video are stored as files just like other text files.
For images, these files can have an extension like– BMP, JPG, GIF, TIF, PNG, PPM, …
For audios, the file extensions include– WAV, MP3, …
The videos files usually have extensions:– AVI, MOV, …
![Page 3: Lec2](https://reader035.vdocuments.site/reader035/viewer/2022070316/55583677d8b42acb078b4885/html5/thumbnails/3.jpg)
An Digital Image Example
Let’s open an image file is its “raw” format:
P6: (this is a ppm image)Resolution: 512x512Depth: 255 (8bits per pixel in each channel)
![Page 4: Lec2](https://reader035.vdocuments.site/reader035/viewer/2022070316/55583677d8b42acb078b4885/html5/thumbnails/4.jpg)
An image containsa header anda bunch of (integer) numbers.
![Page 5: Lec2](https://reader035.vdocuments.site/reader035/viewer/2022070316/55583677d8b42acb078b4885/html5/thumbnails/5.jpg)
Digital Media Capturing To get a digital image, an audio or a video clip, we need
some media capturing device such as– a digital camera or a scanner, – a digital audio recorder, – or a digital camcorder.
All these devices have to complete tasks:– Sampling: To convert a continuous media into discrete
formats.– Digitization: To convert continuous samples into finite
number of digital numbers.– There are probably some further compression process.
![Page 6: Lec2](https://reader035.vdocuments.site/reader035/viewer/2022070316/55583677d8b42acb078b4885/html5/thumbnails/6.jpg)
An Audio Signal
![Page 7: Lec2](https://reader035.vdocuments.site/reader035/viewer/2022070316/55583677d8b42acb078b4885/html5/thumbnails/7.jpg)
Sampling for an Audio Signal
0 10 20 30 40 50 60 70 80 90 100-1
-0.8
-0.6
-0.4
-0.2
0
0.2
0.4
0.6
0.8
1
t
Amplitude
Sampling period Ts,fs =1/Ts
Signal Period T, f = 1/T
Intuitively T should >= 2Ts
![Page 8: Lec2](https://reader035.vdocuments.site/reader035/viewer/2022070316/55583677d8b42acb078b4885/html5/thumbnails/8.jpg)
fs = 2.5f
fs = 1.67f
0 10 20 30 40 50 60 70 80 90 100-1
-0.8
-0.6
-0.4
-0.2
0
0.2
0.4
0.6
0.8
1
t
Amplitude
Original signal
A new component is added
This is denotedas aliasing.
0 10 20 30 40 50 60 70 80 90 100-1
-0.8
-0.6
-0.4
-0.2
0
0.2
0.4
0.6
0.8
1
t
Amplitude
![Page 9: Lec2](https://reader035.vdocuments.site/reader035/viewer/2022070316/55583677d8b42acb078b4885/html5/thumbnails/9.jpg)
0 10 20 30 40 50 60 70 80 90 100-1
-0.8
-0.6
-0.4
-0.2
0
0.2
0.4
0.6
0.8
1
t
Amplitude
0 10 20 30 40 50 60 70 80 90 100-1
-0.8
-0.6
-0.4
-0.2
0
0.2
0.4
0.6
0.8
1
t
Amplitude
fs = 2f
There are infinite numberof possible sinwaves going throughthe sampling points
![Page 10: Lec2](https://reader035.vdocuments.site/reader035/viewer/2022070316/55583677d8b42acb078b4885/html5/thumbnails/10.jpg)
Frequency Decomposition Any signal can be represented as the
summation of sin waves (possibly infinite number of them).
We can use “Fourier Transform” to compute these frequency components.
We can now extend our analysis to any signals.
If we have a signal has frequency components
{f1 < f2 < f3 … < fn} so what is the minimum sampling frequency we should use?
![Page 11: Lec2](https://reader035.vdocuments.site/reader035/viewer/2022070316/55583677d8b42acb078b4885/html5/thumbnails/11.jpg)
Nyquist Theorem
Nyquist theorem – The necessary condition of reconstructing a
continuous signal from the sampling version is that the sampling frequency
fs > 2fmax
fmax is the highest frequency component in the signal.
– If a signal’s frequency components are restricted in [f1, f2], we need fs >2 (f2-f1).
![Page 12: Lec2](https://reader035.vdocuments.site/reader035/viewer/2022070316/55583677d8b42acb078b4885/html5/thumbnails/12.jpg)
Image Sampling
The sampling theorem applies to 2D signal (images) too.
Sampling on a grid Sampling problem
![Page 13: Lec2](https://reader035.vdocuments.site/reader035/viewer/2022070316/55583677d8b42acb078b4885/html5/thumbnails/13.jpg)
The image of Barbara
![Page 14: Lec2](https://reader035.vdocuments.site/reader035/viewer/2022070316/55583677d8b42acb078b4885/html5/thumbnails/14.jpg)
Aliasing due to sampling
![Page 15: Lec2](https://reader035.vdocuments.site/reader035/viewer/2022070316/55583677d8b42acb078b4885/html5/thumbnails/15.jpg)
Digitization The samples are continuous and have infinite number of
possible values.
The digitization process approximates these values with a fixed number of numbers.
To represent N numbers, we need log2N bits.
So, what determines the number of bits we need for an audio clip or an image?
![Page 16: Lec2](https://reader035.vdocuments.site/reader035/viewer/2022070316/55583677d8b42acb078b4885/html5/thumbnails/16.jpg)
Digital Audio You often hear that an audio is 16bits at 44kHz.
44KHz is the sampling frequency. Music has more high frequency components than speech. 8kHz sampling is good enough for telephone quality speech.
16bits means each sample is represented as a 16bit integer.
Digital audio could have more than one channels.
![Page 17: Lec2](https://reader035.vdocuments.site/reader035/viewer/2022070316/55583677d8b42acb078b4885/html5/thumbnails/17.jpg)
Digital Images
An image contains 2D samples of a surface, which can be representedas matrices. Each sample in an image is called a pixel.
![Page 18: Lec2](https://reader035.vdocuments.site/reader035/viewer/2022070316/55583677d8b42acb078b4885/html5/thumbnails/18.jpg)
Types of Digital Images
Grayscale image– Usually we use 256 levels for each pixel. Thus we
need 8bits to represent each pixel (2^8 == 256)– Some images use more bits per pixel, for example
MRI images could use 16bits per pixel.
A 8bit grayscaleImage.
![Page 19: Lec2](https://reader035.vdocuments.site/reader035/viewer/2022070316/55583677d8b42acb078b4885/html5/thumbnails/19.jpg)
Binary Image
A binary image has only two values (0 or 1).
Binary image is quite important in image analysis and objectdetection applications.
![Page 20: Lec2](https://reader035.vdocuments.site/reader035/viewer/2022070316/55583677d8b42acb078b4885/html5/thumbnails/20.jpg)
Bit Plane
[ b7 b6 b5 b4 b3 b2 b1 b0]
MSB LSB
Each bit plane is a binary image.
![Page 21: Lec2](https://reader035.vdocuments.site/reader035/viewer/2022070316/55583677d8b42acb078b4885/html5/thumbnails/21.jpg)
Dithering
A technique to represent a grayscale image with a binary one.
0 1
2 3
Convert image to4 levels: I’ = floor(I/64)
![Page 22: Lec2](https://reader035.vdocuments.site/reader035/viewer/2022070316/55583677d8b42acb078b4885/html5/thumbnails/22.jpg)
Dithering Matrix
Dithering matrix
0 1
2 3
0 23 1
The dithering matrix is
![Page 23: Lec2](https://reader035.vdocuments.site/reader035/viewer/2022070316/55583677d8b42acb078b4885/html5/thumbnails/23.jpg)
Color Image
r
g
b
There are other color spacesLike YUV, HSV etc.
24 bit image
![Page 24: Lec2](https://reader035.vdocuments.site/reader035/viewer/2022070316/55583677d8b42acb078b4885/html5/thumbnails/24.jpg)
Color Table
Image with 256 colors
r
g
b
Clusters of colors
It is possible touse much less colorsTo represent a color imagewithout much degradation.
![Page 25: Lec2](https://reader035.vdocuments.site/reader035/viewer/2022070316/55583677d8b42acb078b4885/html5/thumbnails/25.jpg)
Human Vision
Human eye has two kinds of light sensitive cells. The rods andThe cones.
Rods response curve(black and white vision)
Cones response curve(color vision)
R = s E() Sr()dG = s E() Sg()dB = s E() Sb()d
![Page 26: Lec2](https://reader035.vdocuments.site/reader035/viewer/2022070316/55583677d8b42acb078b4885/html5/thumbnails/26.jpg)
Colors
Color matching function
Colorimeter experiment
![Page 27: Lec2](https://reader035.vdocuments.site/reader035/viewer/2022070316/55583677d8b42acb078b4885/html5/thumbnails/27.jpg)
CIE Color Matching Functions
The amounts of R, G, B lighting sources to form single wavelength light forms the color matching curves.
CIE color matching curves CIE standard color matching functions.
![Page 28: Lec2](https://reader035.vdocuments.site/reader035/viewer/2022070316/55583677d8b42acb078b4885/html5/thumbnails/28.jpg)
Gamma Correction
Most display device’s brightness is not linearly related to the input.
I’ = I
To compensate for the nonlinear distortion we need to raise it to a power again
(I’)1/ = I
for CRT is about 2.2.
![Page 29: Lec2](https://reader035.vdocuments.site/reader035/viewer/2022070316/55583677d8b42acb078b4885/html5/thumbnails/29.jpg)
Gamma Correction
Linearly increasing intensityWithout gamma correction
Linearly increasing intensitywith gamma correction
![Page 30: Lec2](https://reader035.vdocuments.site/reader035/viewer/2022070316/55583677d8b42acb078b4885/html5/thumbnails/30.jpg)
Video Analog video
Even frameOdd Frame
52.7us
10.9us
0v
white
black
![Page 31: Lec2](https://reader035.vdocuments.site/reader035/viewer/2022070316/55583677d8b42acb078b4885/html5/thumbnails/31.jpg)
Digital Video
Frame N-1
Frame 0
time
Digital video is digitizedversion of a 3D functionf(x,y,t)
![Page 32: Lec2](https://reader035.vdocuments.site/reader035/viewer/2022070316/55583677d8b42acb078b4885/html5/thumbnails/32.jpg)
Color System in Video
YUV was used in PAL (an analog video standard) and also used for digital video.
Y is the luminance component (brightness) Y = 0.299 R + 0.587 G + 0.144 B U and V are color components U = B – Y V = R - Y
Y U V
![Page 33: Lec2](https://reader035.vdocuments.site/reader035/viewer/2022070316/55583677d8b42acb078b4885/html5/thumbnails/33.jpg)
YIQ is the color standard in NTSC.
YCbCr: A color system used in JPEG.
I Q