quiz on ch.2 convert 20102 3 to decimal. chapter 03 data representation

Quiz on Ch.2

Convert 201023 to decimal

Chapter 03

Data Representation

Data and Computers

Computers are multimedia devices,dealing with a vast array of information categoriesComputers store, present, and help us modify

• Numbers• Text• Audio• Images and graphics• Video

3

Data compression

Reduction in the amount of space (memory) needed to store the dataMeasured by the Compression ratio = The size of the compressed data divided by the size of the original data

Example: Two files are compressed with the ZIP utility:• One is originally 200 MB, and becomes 150 MB after compression• The other is originally 15 MB, and 11 MB after compression

Which file is better compressed?

4

Quiz

A video file is originally 3.5 GB long.We compress it to 490 MB. What is the compression ratio?

5

Data compression The Compression ratio is always between 0 and 1 (between 0% and 100%)

Compression techniques can beLossless → the data can be retrieved without any

loss of the original informationLossy → some information may be lost in the

process (but it doesn’t matter for the purposes of the intended application)

6

Analog vs. DigitalMany quantities of interest in the real-world are infinite and continuous• Zeno’s paradox: “That which is in locomotion must arrive at the half-way stage before it arrives at the goal. ” — Aristotle, Physics VI:9

But computers are finite and discrete! How do we represent an infinite and continuous quantity in a computer?

Answer: We approximate → represent only enough to satisfy our computational needs and our senses of sight and sound.

7

Information can be represented in one of two ways: analog or digital

Analog data A continuous representation, similar to the

actual information it represents

Digital data A discrete representation, breaking the

information up into separate elements

8

Analog and Digital InformationComputers cannot work well with analog data, so we digitize the data Digitizing = Breaking data into pieces and representing those pieces separately, by using a finite number of binary digits

Fine distinction: there are two operations performed:• one in time (a.k.a. sampling)• the other in amplitude (a.k.a. quantizing) Explain on thermometer, using a waveform example!

9

Quiz

A digital thermometer has a scale from 50 to 100 degrees (F). The temperature is represented on 7 bits. What is the smallest temperature difference that it can measure?

10

Analog and Digital Information

Why do we use binary to represent digitized data?• Price: transistors are cheap to produce

–Remember Babbage!• Reliability: transistors don’t get jammed

–Remember Babbage!

11

Electronic SignalsImportant facts about electronic signals• An analog signal continually fluctuates in voltage up

and down • A digital signal has only a high or low state,

corresponding to the two binary digits

12

Figure 3.2 An analog and a digital signal

• All electronic signals (both analog and digital) degrade due to absorption in transmission lines

• The amplitude (voltage) of electronic signals (both analog and digital) fluctuates due to environmental effects, a.k.a. noise

13

Figure 3.3 Degradation of analog and digital signals

The difference is that digital signals can be

regenerated!

Binary Representations

One bit can be either 0 or 1 •One bit can represent two things

Two bits can represent four things (Why?)

How many things can three bits represent?

How many things can four bits represent?

14

15

Why does the number of combinations double with every extra bit?


How many things can n bits represent?

What happens every time you increase the number of bits by one?

16EoL


How many things can n bits represent?

Reversing the problem: How many bits are needed to represent N things?• All desktops in this lab

17

QUIZ

How many bits are needed to represent all the courses you’re planning to take in college?

18

3.2 Representing Numeric Data

Negative integers

Signed-magnitude representation The sign represents the ordering, and the digits represent the magnitude of the number

19

Negative Integers

There is a problem with the sign-magnitude representation: plus zero and minus zero.• More complex hardware is required!

Solution: Let’s not represent the sign explicitly!“Complement” representation

20

Ten’s complement

Using two decimal digits, represent 100 numbers• If unsigned, the range would be 0…?

let 1 through 49 represent 1 … 49 let 50 through 99 represent -50 … -1

21

Ten’s complement

22

Top: representations (the “label on the jar)

Bottom: the actual numbers that are being

represented (the “content of the jar)

QUIZ

Given the following representations, find in each case what actual number is being represented:• 51• 52• 96 • 47

23

EXTRA-CREDIT QUIZ

If the representation is 76, what actual number is being represented?

24

Why the “complement” in ten’s complement?

100 – 50 = 50100 – 49 = 51…………………….. 100 – 1 = 99

In general:

100 – I = representation of – I

25

QUIZ

What is the representation for each of these actual numbers?• -48 • -40• -30• -5

26

Let’s use ten’s complement!To perform addition, add the numbers and discard any carry

27

Now you try it

48 (signed-magnitude)- 147

How does it work inthe new scheme?

Adding two negative numbers

28

Try these: 4 - 4 -4- 3 +3 + -3

Solve in notebook for next time:

• End of chapter 27 through 31• All examples and exercises presented in the

slides so far (text up to p.61, incl.) – Make sure to ask next time if you have questions!

29

30

Two’s Complement

What do you notice about the left-most bit (MSB)?

Two’s complement on 4 bits (k = 4)

31

Two’s complementAddition and subtraction are the same as in unsigned:

-127 1000 0001

+ 1 0000 0001

-126 1000 0010

Ignore any Carry out of the MSB:

-1 1111 1111 +

-1 1111 1111

-2 1111 1110

32

Two’s complementFormula to compute the negative of a number on k digits:• for ten’s comp: Negative(I) → 10k - I• for two’s comp: Negative(I) → 2k - I

Practice: find the 8-bit two’s comp. representations of:

-2

-51+ 128 (trick question!)- 129 (trick question!)0

33

Warning: The example on p.62 has a typo:

28 is 256, not 128.

QUIZ

What is the 8-bit two’s complement representation of these numbers?• -13 • 40

34

“Fast” two’s complementEasier way to change the sign of a number:Flip all bits, then add 1

Try it out! Find the negatives of the followingtwo’s complement numbers:

0000 00111000 00001000 00011000 00111001 01101111 1111

35

This is how subtraction is

implemented in computer hardware!

A – B = A + (-B)

QUIZ

Perform the following operation in 8-bit two’s complement:

40 – 13

36

What happens if the computed value won't fit in the given number of bits k?

Overflow

If k = 8 bits, adding 127 to 3 overflows 1111 1111

+ 0000 0011 0 1000 0010

… but adding -1 to 3 doesn’t!

Conclusion: overflow is specific to the representation (unsigned, sign-mag., two’s comp., floating point etc.)

37

Overflow is something we should expect when mapping an infinite world onto a finite machine!

38

Trick QUIZ

What decimal number does this binary number represent?

1001 1110

39

Solve in notebook for next time:

End of chapter 33, 40

40

Representing Real Numbers

Real numbers A number with a whole part and a fractional part

104.32, 0.999999, 357.0, and 3.14159

Positions to the right of the decimal point are the fractional part (tenths, hundredths, thousandths etc.): 10-1, 10-2 , 10-3 …

41

Binary fractions

Same rules apply in binary as in decimalDecimal point is actually the radix point Positions to the right of the radix point in binary are

2-1 (one half) = 0.52-2 (one quarter)= 0.252-3 (one eighth) = 0.125…

42

Converting fractions from binary to decimal

Easy! Just multiply with the powers of 2, as we did for unsigned binary. Only difference is that now the powers are negative.

Example: .10012 = 0. 10

43

QUIZ

Convert:

.10112 = 0. 10

44

Converting fractions from decimal to binary

Remember the repeated division algorithm?We apply it for the integer part of the number.

To covert the fractional part, we use the repeated multiplication algorithm!

Example: 0.43510 = 0. 2

45

QUIZ

Convert:

0.310 = 0. 2

46

QUIZ

Finite decimal fractions may have infinite binary representation!

0.310 = 0. 0100110011 2

47

Stop after 8 bits!

Representing Real Numbers

48

In general, we use “sign-mantissa-exponent”R = mantissa * 10exp

R = mantissa * 2exp

Depending on the form of the mantissa, we have:• Floating-point notation• Scientific notation

Floating point

49

This representation is called floating point because, although the number of digits is fixed (5 in the example above),the radix point floats (according to the exponent)

In the mantissa, the radix point is

always at the extreme right!

Floating point in binary?

50

It does exist - remember floating point numbers in Python!

It can be implemented either through software, or directly inthe hardware of the computer.

Until 1989, all desktops had an optional chip, separate fromthe main CPU, called FPU (a.k.a. math coprocessor).The Intel 80486, introduced that year, had the first integratedCPU + FPU.

Not in text!

80486 was also the first chip with over 1 million

transistors!

Floating point in binary?

51

Motorola launched their first integrated CPU+FPU, the68040 the next year, in 1990.

The detailed binary implementation of floating point isbeyond the scope of our class.

If you want to learn more:•Search for “IEEE 754” on the web•Take CS 343 or CS 344

Not in text!

Scientific notation

Particular form of floating point, in which the decimal point is kept to the right of the leftmost digit.

12001.32708 is represented 1.200132708 E+4

52

QUIZ

Convert to floating point and to scientific notation:

123.332 =

-0.0034 =

0.0 =

53

For next time:

Read the entire section 3.2Solve in the notebook problems 33, 34, 40

54

3.3 Representing Text

Basic idea:There are finite number of characters to represent, so list them all and assign each a (binary) number, a.k.a. code.

Character set A list of characters and the codes used to represent each one Computer manufacturers (eventually) agreed to standardize

– Read “Character Set Maze” on p.6755

The ASCII Character Set

ASCII = American Standard Code for Information Interchange

ASCII originally used seven bits to represent each character, allowing for 128 unique characters

Later extended ASCII evolved so that all eight bits were used

• How many characters can be represented?

56

7-bit ASCII Character Set

57

8-bit ASCII Character Set

58

Extended ASCII is a superset of 7-bit ASCII: The first 128 characters correspond exactly to 7-bit ASCII

QUIZEncode “Hello, world!” in ASCIIDecode 67 83 32 49 49 48 from ASCII

59

Python to the rescue!Encode “Hello, world!” in ASCII

60

The ASCII Character Set

The first 32 characters in the ASCII character chart do not have a simple character representation to print to the screen.

They are called control characters

61

ASCII Control Characters

62

The Unicode Character Set

Extended ASCII is not enough for international use

Unicode uses 16 bits per character How many characters can UNICODE represent?

Unicode is a superset of ASCII: The first 256 characters correspond exactly to the extended ASCII character set

63

Unicode examples

64Figure 3.6 A few characters in the Unicode character set

Romanian Characters in Unicode

65

Part of the Latin-Extended-B character sub-set

Miscellaneous Characters in Unicode

66

See more online at the official Unicode site

http://www.unicode.org/charts/

Text Compression

Sometimes, assigning 8 or 16 bits to each character in a document uses too much memory

We need ways to store and transmit text efficiently

Text compression techniques:– keyword encoding– run-length encoding– Huffman encoding

67

Keyword EncodingReplace frequently used words with a single character, for example here’s a substitution chart:

68

Keyword EncodingGiven the following paragraph:

We hold these truths to be self-evident, that all men are created equal, that they are endowed by their Creator with certain unalienable Rights, that among these are Life, Liberty and the pursuit of Happiness. That to secure these rights, Governments are instituted among Men, deriving their just powers from the consent of the governed, That whenever any Form of Government becomes destructive of these ends, it is the Right of the People to alter or to abolish it, and to institute new Government, laying its foundation on such principles and organizing its powers in such form, as to them shall seem most likely to effect their Safety and Happiness.

69

Keyword EncodingThe encoded paragraph is:

We hold # truths to be self-evident, $ all men are created equal, $ ~y are endowed by ~ir Creator with certain unalienable Rights, $ among # are Life, Liberty + ~ pursuit of Happiness. $ to secure # rights, Governments are instituted among Men, deriving ~ir just powers from ~ consent of ~ governed, $ whenever any Form of Government becomes destructive of # ends, it is ~ Right of ~ People to alter or to abolish it, + to institute new Government, laying its foundation on such principles + organizing its powers in such form, ^ to ~m shall seem most likely to effect ~ir Safety + Happiness.

70

Keyword EncodingHow much did we compress?

Original paragraph 656 characters

Encoded paragraph 596 characters

Characters saved60 characters

Compression ratio596/656 = 0.9085

Could we use this substitution chart for all text?

71

Keyword EncodingMore advanced idea:

Encode parts of words, like common prefixes and suffixes, e.g. “pre”, “ex”, “ing”, “tion”

72

Run-Length EncodingA single character may be repeated over and over again in a long sequence.Replace a repeated sequence with

– a flag character– repeated character– number of repetitions

*n8– * is the flag character– n is the repeated character– 8 is the number of times n is repeated

73

Run-Length EncodingEncoding example:Original text is

bbbbbbbbjjjkllqqqqqq+++++Encoded text is

*b8jjjkll*q6*+5 (Why isn't l encoded? J?)The compression ratio is 15/25 = .6

Q: This type of repetition doesn’t occur in English text; can you think of a situation where it is very likely to occur? 74

Run-Length EncodingDecoding example: Encoded text is

*x4*p4l*k7Original text is

xxxxpppplkkkkkkk

75

Run-Length EncodingMore advanced idea (not required for exam):

The number of repeated characters cannot be 0, 1, 2 or 3 (Why?)

We can shift the range 0 – 255 to represent run lengths between 4 - 259

Perform the encoding and decoding in the prev. examples with the convention above!

76

Huffman CodesLetter & Word Frequency distributions:

77

http://letterfrequency.org/

Huffman CodesConclusion: each language and each topic have specific frequencies of characters and groups of characters (digraphs, trigraphs etc.)

Why should the characters “X" or "z" take up the same number of bits as "e" or "t"?

Huffman codes use variable-length bit strings to represent each character. More frequently-used letters have shorter strings to represent them, and vice-versa!

78

Huffman encoding example

“ballboard” would be1010001001001010110001111011

compression ratio28/56

QUIZ: Encode “roadbed”

79

Huffman decoding

In Huffman encoding no character's bit string is the prefix of any other character's bit string. Codes with this property are called prefix codes.

To decodelook for match left to right, bit by bitrecord letter when the first match is foundcontinue where you left off, going left to right

80

Huffman decoding QUIZ

81

Decode:

1011111001010

For next time:

Read the entire section 3.3Solve in the notebook problems 49, 53a,b

82

3.4 Representing Audio Data

83

We perceive sound when a series of air waves cause

to vibrate a membrane in our ear (eardrum), which

sends signals to our brain.

Analog Audio

Record players and stereos send analog signals to speakers to produce sound.These signals are analog representations of the sound waves.The voltage in the signal varies in direct proportion to the amplitude of the sound wave.

84

From Analog to Digital Audio

Digitize the signal by sampling and quantizing – periodically measure the voltage – record the numeric value

How often should we sample?

A sampling rate of about 40,000 times per second is enough to create a reasonable sound reproduction

85

44,000 for audio CD, to be exact

Sampling and Quantizing

86

Some informationis lost, but areasonablesound is reproduced

Digital Audio on a CD

87

Figure 3.9 A CD player reading binary information

“pit”

“land”

Digital Audio on a CD

On the surface of the CD are microscopic pits and lands that represent binary digits

A low intensity laser is pointed as the disc. The

laser light reflects strongly if the surface is

smooth and poorly if the surface is pitted ???

88

89

Pit height is about ¼ the

laser’s wavelength

“destructive interference”

90

Both halves of the laser beam reflect off pit or both halves off land.The two halves are “in phase”.

Half of the laser beam reflects off pit and half off land.The 2 halves are “out of phase”.

Audio Formats

Audio Formats– WAV, AU, AIFF, VQF, and MP3

MP3 (MPEG-2, audio layer 3 file) is dominant – analyzes the frequency spread and discards information

that can’t be heard by humans (>16 kHz) – bit stream is compressed using a form of Huffman

encoding to achieve additional compression

Is this a lossy or lossless compression?

91

3.5 Representing Images and Graphics

Color

Perception of the frequencies of light that reach the retinas of our eyes

Retinas have three types of color photoreceptor cone cells that correspond to the colors of red, green, and blue

92

Color is expressed as an RGB (red-green-blue) value = three numbers that indicate the relative contribution of each of these three primary colors

An RGB value of (255, 255, 0) maximizes the contribution of red and green, and minimizes the contribution of blue, which results in a bright yellow.

93

94

Look at the snow and the barn!

Source: Wikipedia – RGB color model

http://upload.wikimedia.org/wikipedia/commons/2/28/RGB_illumination.jpg

http://upload.wikimedia.org/wikipedia/commons/c/ce/Barn_grand_tetons_rgb_separation.jpg

Representing Images and Graphics

Can you understand this HTML code?

<font color="#FF0000">

Blah blah …

</font>

95

RGB Color Chart in hex

http://www.yenra.com/rgb-hex-triplet-color-chart/

QUIZ

96

Explain the similarities and differences between 00FF00 and 008800

The color cube

97Figure 3.10 Three-dimensional color space

Representing Images and Graphics

color depthThe amount of data that is used to represent a color HiColor A 16-bit color depth: five bits used for each number in an RGB value with the extra bit sometimes used to represent transparency TrueColorA 24-bit color depth: eight bits used for each number in an RGB value

98

Extra-credit question

TrueColorA 24-bit color depth: eight bits used for each number in an RGB value

How many different colors can be represented in TrueColor? Please show your work.

99

QUIZ

100

Are these HiColor or TrueColor?

Indexed Color

A browser may support only a certain number of specific colors, creating a palette from which to choose

101

Figure 3.11 The Netscape color palette

EOL

Extra-credit question

How many bits are needed to represent this palette? Please show your work.

102

How to digitize a picture •Sample it → Represent it as a collection of individual dots called pixels•Quantize it → Represent each pixel as one of 224 possible colors (TrueColor)

Resolution = The # of pixels used to represent a picture

103

Digitized Images and Graphics

104

Figure 3.12 A digitized picture composed of many individual pixels

Wholepicture

Digitized Images and Graphics

105

Figure 3.12 A digitized picture composed of many individual pixels

Magnified portionof the picture

See the pixels?

Hands-on: paste the high-res image fromthe previous slide inPaint, then chooseZOOM = 800

Raster Graphics = Storage of data on a pixel-by-pixel basis

•Bitmap (BMP), GIF, JPEG, and PNG are the most popular raster-graphics formats

GIF formatEach image is made up of only 256 colors (indexed color)• But they can be a different 256 for each image!Supports animation! ExampleOptimal for line art

PNG format (“ping” = Portable Network Graphics) Like GIF but achieves greater compression with wider range of color depthNo animations

106

http://www.gifanimations.com/GA/animation/ImageDisplay/1/11/1

Bitmap formatContains the pixel color values of the image from left to right and from top to bottom• Great candidate for run-length compression!• Losless

JPEG formatAverages color hues over short distances• Lossy compressionOptimal for color photographs

107

Vector Graphics

A format that describes an image in terms of lines and geometric shapes

A vector graphic is a series of commands that describe a line’s direction, thickness, and color

The file sizes tend to be smaller because not every pixel is described

Example: Flash

108

Vector Graphics

The good side:Vector graphics can be resized mathematically and changes can be calculated dynamically as needed

The bad side:Vector graphics are not good for representing real-world images

109

3.6 Representing VideoThe problem: huge amount of data!

Example: In HDTV, the Frame size is defined as the number of horizontal pixels × number of vertical pixels:• 1280 × 720 • 1920 × 1080

Calculate: 1] Data rate (bits per second) for 25 fps 2] Size (bytes) of 2-hour movie

110

Video codec = COmpressor/DECompressor Methods used to shrink the size of a movie to allow it to be played on a computer or over a network

Almost all video codecs use lossy compression to minimize the huge amounts of data associated with video

111

Compressing Video

Temporal compression A technique based on differences between consecutive frames: If most of an image in two frames hasn’t changed, why should we waste space to duplicate all of the similar information?Spatial compression A technique based on removing redundant information within a frame: This problem is essentially the same as that faced when compressing still images.

112

To read:

113

Ethical Issues

MGM Studios, Inc. v. Grokster, Ltd.Describe the background for this

lawsuitWhat role did Napster play in this case?What was the decision in this case?Has this case affected you personally?

114

Who is Bob Bemer?

115

I love ASCII !

Do you know?

116

What is JamBayes?

How many computer character sets existedin 1960?

Who described the telegraph as a kind of very long cat?

Chapter Review Questions

• Distinguish between analog and digital information• Explain data compression and calculate compression

ratios• Explain the binary formats for negative and floating-

point values• Describe the characteristics of the ASCII and Unicode

character sets

• Perform various types of text compression

117

Chapter Review Questions

• Explain the nature of sound and its representation• Explain how RGB values define a color• Distinguish between raster and vector graphics• Explain temporal and spatial video compression

118

Homework for Ch.3Due next Wed, Feb 15

End-of-chapter 3:

Exercises 47, 48, 51, 52, 53, 62, 65, 66,68 through 72, 74, 76 through 79

119

quiz on ch.2 convert 20102 3 to decimal. chapter 03 data representation

Documents

digital data

digital analog data

data representation

digital signal slide

compressed data

digital signals

digital information

breaking data