Hypertext (1)


Page 1: Hypertext (1)

Hypertext (1)

• Historically, text is sequential: read from beginning to end

• Hypertext is non-sequential, with internal links from one part to another

• Hypertext, the word, coined by Ted Nelson in 1966.

• First hypertext system, Xanadu, named for Coleridge’s magical world.

Page 2: Hypertext (1)

Hypertext (2)

Links in hypertext give access to:

• topics or information directly related to the current idea

• notes, such as footnotes or endnotes

• explanations of special words or phrases

• biographical information about people behind the current idea

Page 3: Hypertext (1)

Claims about Hypertext

• Represents large body of information organized into numerous fragments

• Fragments relate to one another

• User needs only a small fraction of the fragments at any time

• Exists only in cooperation with the reader

• Is a legitimate literary concept

Page 4: Hypertext (1)

Claims about Hypertext (2)

• Integrates three technologies
  – Publishing (as a book publisher would)
  – Computing (as the infrastructure)
  – Broadcasting (over a computer network)

• Depends on computer environment for high-speed transitions between nodes

• Modelled by network ADT

Page 5: Hypertext (1)

Using Hypertext

• Browser, or hypertext engine: a computer-based system that allows links to be followed easily

• Navigation aids: parts of the user interface that provide a sense of location and direction

• Notation: a convenient way of specifying links as a hypertext author

Page 6: Hypertext (1)

WWW as a Hypertext System

• Browser: Netscape, for example

• Navigational aids:
  – Forward, back, home
  – History list
  – Colored anchors
  – Consistent titles

• Notation: HTML

Page 7: Hypertext (1)

Network ADT

• Model of hypertext

• Similar to tree ADT, but allows cycles

• Links have an explicit direction, capturing the idea of going forward and going back

Page 8: Hypertext (1)

Network ADT (2)

• Definition: A network is a collection of nodes and links between pairs of nodes such that
  – Each link has a direction.
  – Each node is reachable from any other node. However, the path is not necessarily unique.
  – No node is linked to itself.
  – There are no duplicate links in the same direction.
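
The definition above maps directly onto a small data structure. Below is a minimal Python sketch (not from the slides; the class and method names are illustrative) that stores directed links, rejects self-links and duplicate links, and checks reachability while allowing reverse travel along a link.

```python
# Minimal sketch of the network ADT described above (illustrative, not from the slides).
# Nodes are labels; links are directed pairs. add_link enforces the rules: no self-links
# and no duplicate links in the same direction. Reachability follows links in either
# direction, matching the "reverse travel is possible" observation.

from collections import defaultdict

class Network:
    def __init__(self):
        self.links = set()            # set of (source, target) pairs
        self.out = defaultdict(set)   # forward adjacency
        self.inc = defaultdict(set)   # backward adjacency

    def add_link(self, a, b):
        if a == b:
            raise ValueError("no node may be linked to itself")
        if (a, b) in self.links:
            raise ValueError("no duplicate links in the same direction")
        self.links.add((a, b))
        self.out[a].add(b)
        self.inc[b].add(a)

    def reachable(self, start, goal):
        # Simple graph search; links may be traversed forwards or backwards.
        seen, frontier = {start}, [start]
        while frontier:
            node = frontier.pop()
            if node == goal:
                return True
            for nxt in self.out[node] | self.inc[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append(nxt)
        return False

net = Network()
net.add_link("home", "page1")
net.add_link("page1", "page2")
net.add_link("page2", "home")           # cycles are allowed
print(net.reachable("page2", "page1"))  # True
```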

Page 9: Hypertext (1)

Network ADT (3)

• Observations:
  – There is no hierarchy; all nodes are considered the same. (In a tree, the root is special.)
  – Links have direction, but reverse travel is possible. (One can go backwards on a link, or forwards on a link that goes in the opposite direction.)
  – Cycles are allowed.

Page 10: Hypertext (1)

Directed Graphs

• Both networks and rooted trees are examples of a connected directed graph, sometimes called a digraph.

• Formally, a digraph is a set of nodes and a set of links joining ordered pairs of nodes. The link (A,B) that joins A to B is different from the link (B,A) that joins B to A.

Page 11: Hypertext (1)

Navigation in Sequential Text

• Low level:
  – Punctuation
  – Fonts
  – Separation into sentences and paragraphs

• High level:
  – Chapters, sections, subsections
  – Table of contents
  – Index

Page 12: Hypertext (1)

Navigation in Sequential Text (2)

• Page layout
  – Page numbers
  – Running heads
  – Displayed text

Page 13: Hypertext (1)

Navigating in Hypertext

• Issues:
  – Where am I? Have I been here before? When?
  – How did I get here?
  – Where can I go?
    • Anchors (or links)
    • Implicit anchors (or links): clipboard, glossary, calculator
    • Computed links: next train
    • Back
    • Forward
    • Home

Page 14: Hypertext (1)

Navigating in Hypertext (2)

• Within a node:
  – Save to disk
  – Print
  – Annotate
  – Scroll
  – Zoom

Page 15: Hypertext (1)

Navigating in Hypertext (3)

• User interface support
  – Give power to the users through
    • short response time
    • low cognitive load
    • path clues, perhaps decaying over time
  – Follow a path forward or backward
  – Return to a node

Page 16: Hypertext (1)

Text Markup

• Unified view of text and hypertext presentation

• Foundation of all word processors

• Describes all electronic manuscripts by
  – separating logical elements
  – specifying processing functions for these elements

Page 17: Hypertext (1)

Text Markup (2)

• Originated by William Tunnicliffe (Sept. 1967), in a talk advocating separating a document's information content from its format

• Control formatting with embedded codes

Page 18: Hypertext (1)

Generalized Markup

• Goal: allow editing, formatting, and retrieval systems to share documents

• Devised by Goldfarb, Mosher, Lorie at IBM, 1969

• Formally defined
  – document types
  – explicit nested element structure
  – generic identifier associated with each element

Page 19: Hypertext (1)

SGML

• Standard Generalized Markup Language

• First draft standard, 1980

• ISO 8879, 1986

• Based on the tree ADT

• Allows the description of a document, considered as a tree, to be embedded in the file containing the document

Page 20: Hypertext (1)

Functions of SGML

• Tags documents in a formal language

• Describes internal logical structures

• Links files with an addressing scheme

• Acts as a database language for text

• Accommodates multimedia and hypertext

• Provides a grammar for style sheets

• Allows coded text reuse in surprising ways

Page 21: Hypertext (1)

Functions of SGML (2)

• Represents documents independent of computing platform

• Provides a standard for transferring documents among platforms and applications

• Acts as a metalanguage for document types

• Represents hierarchies

• Extends to accommodate new document types

Page 22: Hypertext (1)

Generic Identifiers

• Tagging vs. formatting
  – Tagging shows document structure
  – Formatting describes document display
  – Example: A paragraph is a sequence of closely connected sentences and can be delimited by a tag. A paragraph can be displayed with either
    • initial indenting or not
    • extra separation or not

Page 23: Hypertext (1)

Generic Identifiers (2)

• Syntax
  – Beginning: <identifier>
  – End: </identifier>

• Attribute list, with assigned values, may follow identifier

Page 24: Hypertext (1)

Generic Identifiers (3)

• Typical identifiers:
  – p   paragraph
  – q   quotation
  – ol  numbered (ordered) list
  – ul  unnumbered list
  – li  list item
  – b   bold face
  – i   italics

Page 25: Hypertext (1)

Display of Text

• ASCII codes for printing characters carry no information about display

• Printed or displayed characters are described by their font.

Page 26: Hypertext (1)

Fonts

• Fonts come in families, which are groups of fonts with similar design characteristics.

• A font is a set of displayed characters in a particular design. To describe a font, we specify:
  – The font face, or typeface, which is the design of the font.
  – The size, measured in points, which is the height of representative characters.
  – The appearance: bold, italic, underline, outline, shadow, small cap, redline, strikeout, etc.

Page 27: Hypertext (1)

Fonts (2)

• Font families include standard modifications of a base font, such as italics and bold, to change the appearance. (This family is Times New Roman.)

• Some families are sans serif, without the cross strokes accentuating the ends of the main strokes.

Page 28: Hypertext (1)

Fonts (3)

• Typical examples of fonts are
  – Times New Roman
  – Arial
  – Century Schoolbook
  – Lucida Calligraphy
  – Verdana

Page 29: Hypertext (1)

Fonts (4)

• The size of this font is 32 points

• This is 54 points

• This is 24 points

• There are 72.27 traditional printer's points per inch (desktop publishing software rounds this to exactly 72)

Page 30: Hypertext (1)

Fonts (5)

To render a character in a font, one must

• Know the computer code (ASCII) of the character

• Know the font name and properties

Then the computer creates the glyph that represents the character in the specified font.

Page 31: Hypertext (1)

Fonts (6)

In the process, the computer uses the following to form and locate the glyph:

• Baseline: the invisible line on which characters are aligned.

• x-height: the actual height of the character x.

• Kerning: spacing between two letters. Note that in printing “wo” the “o” slides under the “w”.

Page 32: Hypertext (1)

Input devices for text

• Keyboard

• Scanning with optical character recognition
  – Hand printed
  – Hand written (cursive)
  – Machine printed

• Voice recognition

• Pen-based

Page 33: Hypertext (1)

Input errors

• Human-based, e.g.
  – Typographic
  – Poor writing

• Machine dependent
  – Small typeface differences: O vs. D

• Limits of technology

• Pre-existing errors

Page 34: Hypertext (1)

Automatic error correction

• 98% OCR accuracy plus automatic correction gives an error rate comparable to keyboard input

• Automatic correction also helpful in:
  – Computer-aided authoring
  – Communication enhancement for the disabled
  – Natural language responses
  – Database interaction

• Example: MS Word AutoCorrect

Page 35: Hypertext (1)

Automatic spelling correction

• Three increasingly difficult tasks:
  – Non-word detection: string in text not in dictionary
  – Isolated word correction: thier automatically becomes their
  – Context-dependent correction: here automatically becomes hear

Page 36: Hypertext (1)

MS Word AutoCorrect

Page 37: Hypertext (1)

General spelling correction

• Can allow human intervention, e.g. choose the correct spelling from a list of candidates

• No context dependent general purpose correction tool exists yet.

Page 38: Hypertext (1)

Issues for spelling correction

• Type of input device
  – Focus on adjacent keys: b vs. n
  – Focus on similar shapes: O vs. D

• Interactive vs. automatic correction
  – How many choices are reasonable? (One for automatic correction.)
  – How accurate should guesses be?

• Proper choice of dictionary

Page 39: Hypertext (1)

Proper Dictionary

Page 40: Hypertext (1)

Word list choice

• Use a lexicon: a word list appropriate to a particular topic

• As opposed to a dictionary: a comprehensive list of words

• Include provision for adding new words

Page 41: Hypertext (1)

Word list choice: Example 1

• Compare NY Times news wire text with Webster’s 7th Collegiate Dictionary

• 8 million words in news wire text:
  – only 36% in dictionary
  – only 39% of dictionary words used in text

Page 42: Hypertext (1)

Example 1 (continued)

• Of text words not in dictionary
  – 1/4 inflected forms (change in case, gender, tense)
  – 1/4 proper names
  – 1/6 hyphenated forms
  – 1/12 misspellings
  – 1/4 unresolved by investigators (new words, etc.)

• How to handle proper names?

Page 43: Hypertext (1)

Example 2

• Corpus of 22 million words from a variety of genres

• Effect of changing lexicon from 50,000 to 60,000 words?
  – Eliminated 1348 false rejections (words are now included in lexicon)
  – Created 23 false acceptances (originally misspelled, now occur in lexicon and are therefore treated as correctly spelled)

Page 44: Hypertext (1)

Unintentionally correct spellings

• Misuse of word: there for their, to for too

• Typo: from for form

• Quote from Mozart: I’ll see you in five minuets

Page 45: Hypertext (1)

Issues in detection

• Given document as a sequence of words, lexicon as ordered list of words, report all document words not in lexicon, but:

• How to handle upper case letters?

• How to handle suffixes and prefixes?

• What definition of word to use?

Page 46: Hypertext (1)

Issues in detection (2)

• Upper case: Change all to lower case
  – Handles first word of sentence and proper names that are words: Bob Brown
  – Confuses: DEC (ok), Dec (abbreviation), dec (misspelling)
  – Must put back capitalization

Page 47: Hypertext (1)

Types of errors

• From keyboard input, 80% of misspellings are one of
  – Insertion
  – Deletion
  – Substitution, especially nearby keys
  – Transposition

• Few errors occur in first letter

• Mostly, length is same or changes by 1

Page 48: Hypertext (1)

Suggestion Strategies

• Words with same first letter first

• Order rest by change in length

Page 49: Hypertext (1)

Types of errors (2)

• Improper spacing: run-ons or splits
  – Significant unsolved problem

• Cognitive
  – recieve for receive; procede for proceed
  – conspiricy for conspiracy; mispell for misspell

• Phonetic
  – abiss for abyss; nacherly for naturally

Page 50: Hypertext (1)

Spelling Rules

• I before E except after C

• Only exceed, succeed, and proceed end in -ceed; all others end in -cede, except supersede (which ends in -sede)

Page 51: Hypertext (1)

Suggestion Strategies (2)

• Words with same first letter first

• Order rest by change in length

• Use standard spelling rules

Page 52: Hypertext (1)

Suggestion Principles

• Edit distance: The minimum number of insertions, deletions, or substitutions needed to change one string to another, defined by Levenshtein in 1966

• Provide suggestions in increasing order of edit distance
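
As an illustration of the definition, here is the standard dynamic-programming computation of Levenshtein edit distance; it is a generic sketch, not code from the course.

```python
# Sketch of Levenshtein edit distance (insertions, deletions, substitutions),
# computed with the usual dynamic-programming table.

def edit_distance(a: str, b: str) -> int:
    m, n = len(a), len(b)
    # dist[i][j] = edit distance between a[:i] and b[:j]
    dist = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dist[i][0] = i           # delete all of a[:i]
    for j in range(n + 1):
        dist[0][j] = j           # insert all of b[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1,         # deletion
                             dist[i][j - 1] + 1,         # insertion
                             dist[i - 1][j - 1] + cost)  # substitution (or match)
    return dist[m][n]

print(edit_distance("thier", "their"))   # 2 (a transposition counts as two substitutions here)
```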

Page 53: Hypertext (1)

Detection Algorithms

• For each word in text, search for word in dictionary. If not found, report spelling error.

• Issues:
  – Efficiency when text or dictionary is large
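
A minimal sketch of this lookup-based detector follows; storing the lexicon in a hash-based set keeps each lookup near constant time, which addresses the efficiency issue above. The tiny word list and the regular expression used to split words are illustrative assumptions.

```python
# Sketch of the dictionary-lookup detector described above. A set (hash table) makes
# each membership test roughly constant time, so the pass is linear in the word count.

import re

lexicon = {"call", "me", "ishmael", "some", "years", "ago"}   # illustrative tiny lexicon

def misspellings(text, lexicon):
    words = re.findall(r"[a-z']+", text.lower())   # crude definition of "word"
    return [w for w in words if w not in lexicon]

print(misspellings("Call me Ishmale, some years ago", lexicon))  # ['ishmale']
```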

Page 54: Hypertext (1)

Detection Algorithms (2)

• n-gram analysis

• Issues:
  – Requires preprocessing of dictionary
  – Extremely fast if misspelling creates unusual n-gram

Page 55: Hypertext (1)

n-gram Fundamentals

• Definition: an n-gram is a substring of length n of a given word.

• Examples:
  – The word weasel contains 5 digrams (2-grams), namely we, ea, as, se, el.
  – The word monkey contains 4 trigrams (3-grams), namely mon, onk, nke, key.
  – The word turkey contains 6 monograms (1-grams), namely t, u, r, k, e, y.
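
A one-line extraction function makes the definition concrete and reproduces the examples above (an illustrative sketch, not the course's code).

```python
# Sketch of n-gram extraction matching the definition above.

def ngrams(word, n):
    return [word[i:i + n] for i in range(len(word) - n + 1)]

print(ngrams("weasel", 2))  # ['we', 'ea', 'as', 'se', 'el']
print(ngrams("monkey", 3))  # ['mon', 'onk', 'nke', 'key']
print(ngrams("turkey", 1))  # ['t', 'u', 'r', 'k', 'e', 'y']
```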

Page 56: Hypertext (1)

n-gram Strategy

• Preprocess the dictionary to create a list of all the n-grams contained in words in the dictionary.
  – Eliminate duplicates from the list
  – Perhaps record the position within the word of the n-gram.

• Detect a spelling error by discovering an n-gram in the target word that is not in the n-gram list.

Page 57: Hypertext (1)

Arrays

• Definition: A data structure is a particular way of storing data in a computer.

• Definition: An array is an indexed set of values. Informally, an array can be viewed as a table.

• Example (of a data structure): An array is a data structure.

Page 58: Hypertext (1)

Arrays (2)

• Array index:
  – Usually positive integers to some maximum size, e.g. 1 to 500.
  – Can also be another ordered set, e.g. the alphabet, the characters in ASCII order

• Values: Whatever one wants to store: numbers, letters, strings, other arrays.

Page 59: Hypertext (1)

Array Examples

• Table of hex and binary numbers corresponding to base 10 numbers. The index set is the base 10 numbers, the array values (table entries) are the corresponding hex and binary numbers

• List of words for searching. The index is the position in the list, the array values are the words viewed as strings.

Page 60: Hypertext (1)

Array Examples (2)

• Shift table for Boyer-Moore searching. The index is the set of characters. The array value is the number representing the shift amount for that index character.

• List of ASCII codes. The index is the ASCII code, 00 to FF in hex numbers. The array value is the character represented by the index hex number.
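
The Boyer-Moore shift table mentioned above can be realized in several ways; one common concrete form is the bad-character table of the Horspool variant, sketched below. Which exact table the slide intends is an assumption here.

```python
# One concrete reading of the "shift table for Boyer-Moore searching" mentioned above:
# the bad-character table of the Horspool variant. The index set is the characters in
# the pattern; the value is how far the pattern may slide when that character sits
# under the pattern's last position. Characters not in the table shift by the full
# pattern length (use table.get(c, len(pattern)) at search time).

def horspool_shift_table(pattern):
    m = len(pattern)
    table = dict.fromkeys(pattern, m)        # default: shift by the full pattern length
    for i, c in enumerate(pattern[:-1]):     # all but the last character
        table[c] = m - 1 - i                 # distance from rightmost occurrence to the end
    return table

print(horspool_shift_table("weasel"))
# {'w': 5, 'e': 1, 'a': 3, 's': 2, 'l': 6}
```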

Page 61: Hypertext (1)

Digram Arrays

• A digram array is an array indexed by the letters a through z. Each value is, in turn, an array indexed by the letters a through z.

• A digram array can be viewed as a table whose rows and columns are indexed by the 26 lower case letters.

• Typically, we use binary digits as the values in a digram array, creating a binary digram array, or BDA.

Page 62: Hypertext (1)

Digram Arrays (2)

• Assume that a dictionary is given.

• Preprocess the dictionary by setting the value in a digram array for each digram that appears in each word in the dictionary.

• Notes:
  – The digram array depends on the dictionary
  – Typically 42% of entries are 0
  – Trigram arrays may be constructed in the same way.
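
The preprocessing step can be sketched as follows; the tiny dictionary is illustrative. A 0 entry proves that any word containing that digram is absent from the dictionary, while a 1 entry is inconclusive, as the next slides spell out.

```python
# Sketch of a nonpositional binary digram array: a 26x26 table of 0/1 values indexed
# by letter pairs, built from a small illustrative dictionary.

import string

def build_bda(dictionary):
    index = {c: i for i, c in enumerate(string.ascii_lowercase)}
    bda = [[0] * 26 for _ in range(26)]
    for word in dictionary:
        for a, b in zip(word, word[1:]):            # every digram in the word
            bda[index[a]][index[b]] = 1
    return bda, index

def word_definitely_absent(word, bda, index):
    return any(bda[index[a]][index[b]] == 0 for a, b in zip(word, word[1:]))

dictionary = ["cuckoo", "weasel", "monkey", "turkey", "light"]
bda, index = build_bda(dictionary)
print(word_definitely_absent("mvore", bda, index))   # True: the digram 'mv' never occurs
print(word_definitely_absent("turkel", bda, index))  # False: every digram occurs, so the test is inconclusive
```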

Page 63: Hypertext (1)

Nonpositional BDA

• Each value, or cell, in a BDA is associated with the digram represented by the row and column index of the cell.

• Example: The digram ck is associated to the value in the cell in row c, column k.

• The value in a nonpositional BDA associated to a digram is 1 if that digram appears in some word in the dictionary and is 0 otherwise.

Page 64: Hypertext (1)

Nonpositional BDA (2)

• Example: The value associated with the digram ck is 1 if some word containing ck appears in the dictionary (e.g. cuckoo). The value is 0 if no word in the dictionary contains ck.

• Example: If the word whose spelling is being checked contains the digram mv and the value associated with this digram is 0, then the word does not appear in the dictionary.

Page 65: Hypertext (1)

Nonpositional BDA (3)

• Example: If the word whose spelling is being checked contains the digram gh and the value associated with this digram in the array is 1, then one cannot say whether the word is spelled correctly, based just on this information.

Page 66: Hypertext (1)

Example: Moby Dick

• Class examined Chapters 31-93• Summary file contains

– 284,591 characters– 63,851 words– 63,853 sentences– 63,585 lines– 63,583 paragraphs– 1413 pages

Page 67: Hypertext (1)

Example: Moby Dick (2)

• After processing (removing numbers, upper case letters, and punctuation), file contains– 70039 characters– 9578 words– 9577 sentences– 9577 lines– 9577 paragraphs– 213 pages

Page 68: Hypertext (1)

Example: Moby Dick (3)

• Checking digrams, we find

Page 69: Hypertext (1)

Positional BDA

• Assume that the longest word in the dictionary has length M.

• Denote the position of a digram by k. Then k has value 1, 2, ... , M-1.

• For each digram, create an array of length M-1, where the value at index k is 1 if the digram appears in a word in the dictionary in position k. The value is 0 if no word in the dictionary has this digram at position k.
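
A sketch of the same idea with positions recorded, again over a small illustrative dictionary: the table maps each digram to a 0/1 vector indexed by the starting position k.

```python
# Sketch of a positional BDA: for each digram, a 0/1 vector indexed by the position k
# (1..M-1) at which the digram starts in some dictionary word.

def build_positional_bda(dictionary):
    max_len = max(len(w) for w in dictionary)          # M
    table = {}                                         # digram -> list of length M-1
    for word in dictionary:
        for k, (a, b) in enumerate(zip(word, word[1:]), start=1):
            vec = table.setdefault(a + b, [0] * (max_len - 1))
            vec[k - 1] = 1
    return table, max_len

dictionary = ["cat", "category", "sparrow"]
table, M = build_positional_bda(dictionary)
print(table["at"])      # [0, 1, 0, 0, 0, 0, 0]: 'at' starts only at position k=2
print(table["sp"])      # [1, 0, 0, 0, 0, 0, 0]: 'sp' starts only at position k=1, so k=7 is 0
```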

Page 70: Hypertext (1)

Positional BDA (2)

• Example: In the positional BDA for the digram at, the value indexed by k=3 is equal to 1 if some word in the dictionary has the form ??at*

• Example: In the positional BDA for the digram sp, the value indexed by k=7 is equal to 0 if no word in the dictionary is of the form ??????sp*

Page 71: Hypertext (1)

Effectiveness

• Typically about 42% of entries in a non-positional BDA are 0

• Randomly changing one letter in a word will produce a digram with value 0 in NP BDA about 70% of time

• In a study of handprinted 6-letter words, of 7662 words with a single substitution error, 7561 were detected by positional trigram analysis

Page 72: Hypertext (1)

Encryption

• Goal: provide privacy and security for text transmitted by computer network.
  – Confidentiality of contents
  – Authenticity of sender and receiver
  – Integrity of contents

• Interested parties
  – Military and diplomatic officers
  – Mathematicians and computer scientists
  – E-commerce providers

Page 73: Hypertext (1)

Encryption History

• Early work
  – Cryptography book by George Fisher published by Benjamin Franklin

• Present day
  – Text transmitted by computer network
  – Techniques regulated by federal government

Page 74: Hypertext (1)

Encryption on Networks

• Situation: no transmission on any computer network can be considered absolutely private
  – Network tap is not physically difficult
  – Legitimate use for monitoring traffic to detect problems and potential bottlenecks

Page 75: Hypertext (1)

Intruders

• Passive: listens, gathers information

• Active: captures and (perhaps) replaces
  – Changes amount in a financial transaction
  – Uses a stolen credit card number

Page 76: Hypertext (1)

Encryption Model

Page 77: Hypertext (1)

Encryption Techniques

• Character-based
  – Shift (Caesar cipher)
  – Monoalphabetic substitution (cryptograms)
  – Polyalphabetic cipher

• Numeric
  – Each character is represented by 8 bits
  – Four characters form a 32-bit number
  – Encode these numbers

Page 78: Hypertext (1)

Shift Encryption

• Encryption: Each letter is encoded with the letter k positions from it in the alphabet

• Key: The integer k, in the range –25..25

Page 79: Hypertext (1)

Shift Encryption (2)

• Example 1: Shift

Replace each letter by the one three positions forward in the alphabet, k=+3

WILDCATS ---> ZLOGFDWV

• Example 2: Shift, k = +5

CATS ---> HFYX

Decrypt using k = –5

Page 80: Hypertext (1)

Shift Encryption (3)

Notes on shift encryption

• Only 26 different strategies are possible, and one of those is the null strategy (no encrypting is done).

• If encryption uses the key k, then decryption uses the key –k (or the key 26 – k)
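
A short sketch of shift encryption that reproduces the WILDCATS example above; it is illustrative, not the course's program.

```python
# Sketch of shift (Caesar) encryption: each letter moves k positions forward in the
# alphabet, wrapping around; decryption uses -k (equivalently 26 - k).

import string

ALPHA = string.ascii_uppercase

def shift(text, k):
    return "".join(ALPHA[(ALPHA.index(c) + k) % 26] if c in ALPHA else c for c in text)

print(shift("WILDCATS", 3))   # ZLOGFDWV
print(shift("HFYX", -5))      # CATS
```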

Page 81: Hypertext (1)

Monoalphabetic Substitution

• Encrypt by using a random permutation of the alphabet.

• Key is the permutation, 26! choices are available.

• Decryption by checking all permutations is impossible.

• However, this is the Daily Cryptogram in the newspaper.
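
A sketch of monoalphabetic substitution with a randomly permuted alphabet as the key; the seed and plaintext are illustrative.

```python
# Sketch of monoalphabetic substitution: the key is a random permutation of the
# alphabet (26! possibilities). Encryption substitutes letter for letter;
# decryption inverts the permutation.

import random
import string

ALPHA = string.ascii_uppercase

def make_key(seed=None):
    letters = list(ALPHA)
    random.Random(seed).shuffle(letters)
    return "".join(letters)                  # key[i] replaces ALPHA[i]

def encrypt(text, key):
    return text.upper().translate(str.maketrans(ALPHA, key))

def decrypt(text, key):
    return text.translate(str.maketrans(key, ALPHA))

key = make_key(seed=42)                      # illustrative fixed seed
cipher = encrypt("It is easier to talk than to hold one's tongue", key)
print(cipher)
print(decrypt(cipher, key))                  # recovers the plaintext (in upper case)
```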

Page 82: Hypertext (1)

Monoalphabetic Substitution (2)

• Example:

  Ciphertext: XE XU BRUXBF EM EROJ EARI EM
  Plaintext:  IT IS EASIER TO TALK THAN TO

  Ciphertext: AMOP MIB’S EMIWCB
  Plaintext:  HOLD ONE’S TONGUE

Page 83: Hypertext (1)

Monoalphabetic Substitution (3)

• Notes on monoalphabetic substitution
  – Decryption strategy uses letter patterns, e.g. common digrams and trigrams
  – Heuristics, as opposed to an algorithm

Page 84: Hypertext (1)

Polyalphabetic Substitution

• Caesar cipher has too few keys

• Monoalphabetic substitution has enough keys, but word patterns (digrams and trigrams) allow easy code breaking

• Develop strategy with
  – large number of keys
  – disrupted word patterns

Page 85: Hypertext (1)

Polyalphabetic Substitution (2)

• Start with a 26 x 26 array of letters, shifted by one letter in each row

• Choose a string as a key

• Example: key = springforward, repeated letter by letter above the plaintext (spaces are skipped):

  spr ingforwardsp ringf or ward springforw ardsprin gfo rw ardspri
  The confidential terms of your employment contract are as follows

Page 86: Hypertext (1)

Polyalphabetic Substitution (3)

• The ith character in the text, denoted by c, is replaced by m(d,c), where d is the corresponding character in the key, and the replacement is the character m, appearing in the dth row and cth column of the array.

• Example:
  – d = s, c = t, m(d,c) = l
  – d = p, c = h, m(d,c) = w
  – d = r, c = e, m(d,c) = v

Page 87: Hypertext (1)

Polyalphabetic Substitution (4)

• The encoded message starts

lwvkb tkwua nklsa kmesx cwuol uwbgt …

where the letters have been written in groups of five.
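
With the rows arranged as on the slide, m(d,c) works out to adding the two letter positions mod 26; this tableau scheme is the classical Vigenère cipher. The sketch below (illustrative, not from the course) reproduces the ciphertext groups above.

```python
# Sketch of the polyalphabetic encryption described above: each plaintext letter is
# shifted by the corresponding key letter, with the key repeated over the letters of
# the message (spaces and punctuation skipped).

def vigenere_encrypt(plaintext, key):
    letters = [c for c in plaintext.lower() if c.isalpha()]
    out = []
    for i, p in enumerate(letters):
        d = key[i % len(key)]
        out.append(chr((ord(p) - 97 + ord(d) - 97) % 26 + 97))
    return "".join(out)

message = "The confidential terms of your employment contract are as follows"
cipher = vigenere_encrypt(message, "springforward")
print(" ".join(cipher[i:i + 5] for i in range(0, 30, 5)))
# lwvkb tkwua nklsa kmesx cwuol uwbgt
```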

Page 88: Hypertext (1)

Polyalphabetic Substitution (5)

• To decode a message, knowing the key, match the key with the message.

• Example: key = declaration

  Key:        decla ratio ndecl arati ionde
  Ciphertext: zlgyi etamq bxvup owhnu cahzg

Page 89: Hypertext (1)

Polyalphabetic Substitution (6)

• The ith character p of the plaintext message is the character such that m(d,p) = e, where d is the character of the key corresponding to the ith character e of the encrypted message.

• Operationally,
  – Go to the dth row of the array
  – Find e in this row by scanning across
  – Record p, the column index of e

Page 90: Hypertext (1)

Polyalphabetic Substitution (7)

• Example:
  – d = d, e = z, appears in column w, p = w
  – d = e, e = l, appears in column h, p = h
  – d = c, e = g, appears in column e, p = e
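
Arithmetically, decoding is the reverse shift, (e - d) mod 26. The companion sketch below (again illustrative) decodes the springforward ciphertext from the earlier example.

```python
# Sketch of polyalphabetic decoding: the plaintext letter p satisfies m(d, p) = e,
# which is the reverse shift (e - d) mod 26 with the key repeated over the letters.

def vigenere_decrypt(ciphertext, key):
    letters = [c for c in ciphertext.lower() if c.isalpha()]
    out = []
    for i, e in enumerate(letters):
        d = key[i % len(key)]
        out.append(chr((ord(e) - ord(d)) % 26 + 97))
    return "".join(out)

cipher = "lwvkb tkwua nklsa kmesx cwuol uwbgt"   # the six groups given earlier
print(vigenere_decrypt(cipher, "springforward"))
# theconfidentialtermsofyourempl  (the first 30 letters of the original message)
```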

Page 91: Hypertext (1)
Page 92: Hypertext (1)

Text Compression

• Text is represented as a long string of binary digits, 8 digits per character.

• A 2000-word essay has about
  – 10,000 characters
  – 2000 spaces
  – 96,000 bits

• Question: Can we represent this essay in substantially fewer bits?

Page 93: Hypertext (1)

Text Compression (2)

• Answer: Most likely, since we really only need 7 bits per character for the 94 printing characters plus white space characters.

Page 94: Hypertext (1)

Techniques

• Represent fixed text with a short symbol string, e.g.
  – Stock exchange symbols for company names
  – ISBN numbers for book title and author

• Shorter symbol strings for more frequently occurring text strings
  – Use one bit for the most frequent character, etc.

Page 95: Hypertext (1)

Techniques (2)

• Context-dependent strings
  – Represent common combinations with their own codes
  – Represent constant bit strings

Page 96: Hypertext (1)

Huffman Coding

• Frequency dependent coding

• Uses frequency distribution of characters in text
  – Most commonly occurring letter is E, 13.05%
  – Next most is T, 9.02%
  – Rarest is Z, 0.09%

Page 97: Hypertext (1)

Huffman Coding (2)

• Creating a Huffman code for a set of characters
  – List the characters and their relative frequencies
  – Sort the list in order of least frequent to most frequent
  – Build a coding tree, which is a binary tree, as described below

Page 98: Hypertext (1)

Binary Tree

• A binary tree is a tree in which
  – Each interior node has degree 2 (i.e. two children)
  – The child nodes are ordered

Page 99: Hypertext (1)

Huffman Coding (3)

• To build a Huffman tree
  – List the characters in order of frequency from most to least
  – Make the two least frequent characters leaf nodes and join them to a new node.
  – Label the new node with the sum of the frequencies of the two child nodes
  – Label the link to the least frequent with 0 and the other link with 1

Page 100: Hypertext (1)

Huffman Coding (4)

  – Join the newly created node with the next least frequent character.
  – Again add the frequencies, label the new node, and label the link to the least frequent node with 0, the other link with 1. Caution: compare the character frequency with the new node frequency
  – Continue until all characters have been joined.
  – The last node (the root of the tree) will be labeled with frequency 1.00. (Why?)

Page 101: Hypertext (1)

Huffman Coding (5)

To compress text with a Huffman code:

• Follow the tree from the root to the leaf labeled by a character to find the code of the character, the code being the sequence of link labels on the (unique) path to the character

Page 102: Hypertext (1)

Huffman Coding (6)

Example: Assume only 4 characters (so that the tree doesn’t get too large) with relative frequencies:

A = .40

B = .20

C = .15

D = .25

Total = 1.00

Page 103: Hypertext (1)

Huffman Coding (7)

Sort the characters by frequency, smallest first:

C = .15

B = .20

D = .25

A = .40

Join B and C to get a node labeled .35 = .15 + .20, with link C→.35 labeled 0 and link B→.35 labeled 1.

Page 104: Hypertext (1)

Huffman Coding (8)

Join the next least frequent character node (D) to the new node (.35) and create a node labeled .60 = .35 + .25

Label link D→.60 with 0 and link .35→.60 with 1

Page 105: Hypertext (1)

Huffman Coding (9)

Join the next least frequent character node (A) to the new node (.60) and create a node labeled 1.00 = .60 + .40

Label link A→1.00 with 0 and link .60→1.00 with 1

Page 106: Hypertext (1)

Huffman Coding (10)

Follow the tree from the root to the leaves to find the codes:

A = 0

B = 111

C = 110

D = 10

Without compression, BAD takes 24 bits

With compression, BAD = 111010, 6 bits
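
The whole construction fits in a short program. The sketch below (illustrative, not the course's code) repeatedly joins the two least frequent nodes, labels the link to the less frequent child 0 and the other 1, and reads the codes off the paths from the root; it reproduces A = 0, B = 111, C = 110, D = 10 and the 6-bit encoding of BAD.

```python
# Sketch of Huffman coding that rebuilds the 4-character example above.

import heapq
from itertools import count

def huffman_codes(freqs):
    tick = count()                                    # tie-breaker so heap tuples always compare
    heap = [(f, next(tick), {"char": c}) for c, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f0, _, n0 = heapq.heappop(heap)               # least frequent: its link gets label 0
        f1, _, n1 = heapq.heappop(heap)               # next least:     its link gets label 1
        heapq.heappush(heap, (f0 + f1, next(tick), {"0": n0, "1": n1}))
    codes = {}
    def walk(node, path):
        if "char" in node:
            codes[node["char"]] = path or "0"         # a lone character would get code "0"
        else:
            walk(node["0"], path + "0")
            walk(node["1"], path + "1")
    walk(heap[0][2], "")
    return codes

codes = huffman_codes({"A": 0.40, "B": 0.20, "C": 0.15, "D": 0.25})
print(codes)                                          # {'A': '0', 'D': '10', 'C': '110', 'B': '111'}
print("".join(codes[c] for c in "BAD"))               # 111010 (6 bits vs. 24 uncompressed)
```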