08 - Information Theory I - Dipartimento di Fisica e Geologia (luca.gammaitoni/fisen/...)
TRANSCRIPT
Communication
• Communication is the transfer of information from one place to another.
• This should be done
• as efficiently as possible
• with as much fidelity/reliability as possible
• as securely as possible
• Communication System: Components/subsystems act together to accomplish information transfer/exchange
Communication
• Verbal Communication
• Spoken communication
• Languages and dialects
• Written Communication
• Symbols, hieroglyphics, and drawings
Communication
• Smoke signals, telegraph, telephone…
• 1895: invention of the radio by Marconi
• 1901: transatlantic communication
• …
• nowadays: everything communicates, Internet of Things (IoT)
Communication system
• The message produced by a source must be converted by a transducer to a form suitable for the particular type of communication system.
• Example: In electrical communications, speech waves are converted by a microphone to voltage variations.
A Mathematical Theory of Communication - By C. E. SHANNON
Communication system
• The transmitter processes the input signal to produce a signal suited to the characteristics of the transmission channel.
• Signal processing for transmission almost always involves modulation and may also include coding. In addition to modulation, other functions performed by the transmitter are amplification, filtering and coupling the modulated signal to the channel.
Communication system
• The receiver’s function is to extract the desired signal from the received signal at the channel output and to convert it to a form suitable for the output transducer.
• Other functions performed by the receiver: amplification (the received signal may be extremely weak), demodulation and filtering.
Communication system
• The output transducer converts the electric signal at its input into the form desired by the system user.
• Examples: loudspeakers, personal computers (PCs), tape recorders.
Communication system
• The channel can take different forms: the atmosphere (or free space), coaxial cable, optical fiber, waveguide, etc.
• The signal undergoes some amount of degradation from noise, interference and distortion
Communication system

Frequency | Wavelength | Designation                | Representative Applications
100 GHz   | 1 cm       | Extra High Frequency (EHF) | Satellite, microwave relay, earth-satellite radar
10 GHz    | 10 cm      | Super High Frequency (SHF) | Satellite, microwave relay, earth-satellite radar
1 GHz     | 1 m        | Ultra High Frequency (UHF) | Wireless comm. service, cellular, pagers, UHF TV
100 MHz   | 10 m       | Very High Frequency (VHF)  | Mobile, aeronautical, VHF TV and FM, mobile radio
10 MHz    | 100 m      | High Frequency (HF)        | Amateur radio, civil defense
1 MHz     | 1 km       | Medium Frequency (MF)      | AM broadcasting
100 kHz   | 10 km      | Low Frequency (LF)         | Aeronautical, submarine cable, navigation, transoceanic radio
10 kHz    | 100 km     | Very Low Frequency (VLF)   | Aeronautical, submarine cable, navigation, transoceanic radio

Transmission media: waveguide, coaxial cable, wire pairs.
Propagation modes: line-of-sight radio, sky wave radio, ground wave radio.
Line-of-sight (LOS) propagation
• Transmitting and receiving antennas must be within line of sight
• Examples: Satellite communication, Ground communication
Sky wave (ionospheric) propagation
• Signal reflected from ionized layer of atmosphere. Signal can travel a number of hops, back and forth
• Examples: Shortwave radio
Ground wave propagation
• Follows the contour of the earth; can propagate considerable distances
• Frequencies up to 2 MHz. Example: AM radio
Signal
• The information to be transmitted can be encoded by modulating the amplitude (AM) or the frequency (FM) of a carrier signal
• According to Fourier analysis, any composite signal is a combination of simple sine waves with different frequencies, amplitudes, and phases
• The information transmission rate is limited by the transmitter, the medium and the receiver
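The Fourier decomposition above can be sketched numerically. The composite signal below, its component frequencies, and its amplitudes are illustrative assumptions, not values from the slides:

```python
import numpy as np

# Hypothetical composite signal: sum of a 5 Hz and a 12 Hz sine wave
fs = 1000                      # assumed sampling rate, in Hz
t = np.arange(0, 1, 1 / fs)    # 1 second of samples
signal = 2.0 * np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 12 * t)

# Fourier analysis: the amplitude spectrum reveals the simple sine
# waves (frequency and amplitude) that make up the composite signal
spectrum = np.abs(np.fft.rfft(signal)) / len(t) * 2
freqs = np.fft.rfftfreq(len(t), d=1 / fs)

peaks = freqs[spectrum > 0.1]
print(peaks)   # the two component frequencies, 5 Hz and 12 Hz
```

Running the decomposition on the sum recovers exactly the two frequencies that were mixed in, which is the content of Fourier's statement.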
Harry Nyquist
• Determined that the number of independent pulses that could be put through a telegraph channel per unit time is limited to twice the bandwidth of the channel
• Certain Factors Affecting Telegraph Speed (1924)
• Certain Topics in Telegraph Transmission Theory (1928)
• This rule is essentially a dual of what is now known as the Nyquist–Shannon sampling theorem
Nyquist–Shannon sampling theorem
• From continuous signal to discrete signal
• Sampling is the process of converting a signal into a numeric sequence
• Applies to signals whose Fourier transform is zero outside of a finite region of frequencies
• The fidelity of the result depends on the sampling rate
• No information is lost during the sampling process, provided the bandlimit and sampling-rate conditions are met
Nyquist–Shannon sampling theorem
• If a function x(t) contains no frequencies higher than B cps (Hz), it is completely determined by giving its ordinates at a series of points spaced 1/(2B) seconds apart
• A sufficient sample-rate is therefore 2B samples/second, or anything larger
• For a given sample rate fs the bandlimit for perfect reconstruction is B ≤ fs/2
• 2B: Nyquist rate
• fs/2: Nyquist frequency
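A minimal sketch of what goes wrong below the Nyquist rate, with assumed frequencies: a 3 Hz sine (B = 3 Hz) sampled at fs = 4 Hz, less than 2B = 6 samples/second, produces exactly the same samples as a 1 Hz alias, so the two signals cannot be told apart after sampling:

```python
import numpy as np

# Aliasing demo: sample a 3 Hz sine below its Nyquist rate of 6 Hz
fs = 4.0
t = np.arange(0, 2, 1 / fs)               # 2 seconds of samples

x_fast = np.sin(2 * np.pi * 3 * t)        # 3 Hz signal, undersampled
x_alias = np.sin(2 * np.pi * (-1) * t)    # alias at 3 - fs = -1 Hz

print(np.allclose(x_fast, x_alias))       # True: the sample values coincide
```

With fs ≥ 6 Hz the ambiguity disappears, which is the sense in which 2B samples/second is sufficient for perfect reconstruction.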
Telegraphy
• Telegraphy (from Greek: tele "at a distance", and graphein "to write")
• Long distance transmission of textual/symbolic messages
• The method used for encoding the message must be known to both sender and receiver
• Even e-mail is an example of telegraphy
Measuring information
• s: symbol rate (number of symbols per second)
• n: number of states (binary, decimal, …)
• n^s: possible messages per unit time
• The problem is to estimate the quantity of information relative to a message
Reduction to YES or NO answers
Alice transmits HTTHHT to Bob; for each symbol Bob asks "is it T?" and Alice answers Yes or No.
Transmission of 6 symbols requires 6 questions.
Reduction to YES or NO answers
Alice transmits the word "ginger" to Bob; for each character Bob asks questions like "is it 'c'?" and Alice answers Yes or No.
Inefficient! A maximum of 26 questions per character, about 13 on average (if character outcomes are i.i.d.).
Reduction to YES or NO answers
ABCDEFGHIJKLMNOPQRSTUVWXYZ: is it less than "N"?
ABCDEFGHIJKLMNOPQRSTUVWXYZ: is it less than "F"?
ABCDEFGHIJKLMNOPQRSTUVWXYZ: is it less than "J"?
ABCDEFGHIJKLMNOPQRSTUVWXYZ: is it less than "H"?
ABCDEFGHIJKLMNOPQRSTUVWXYZ
After at most 5 questions we correctly identify the character
Minimum number of questions
• 2^(# questions) = 26 (for an alphabet character)
• # questions = log2(26) ≈ 4.7 expected number of questions
• for a word composed of 6 characters, 6 × 4.7 ≈ 28.2 questions are needed
Reduction to YES or NO answers
• Rationale: at each iteration, reduce the size of the set by one half
• Build a decision tree where the leaves of the tree are the available symbols
• The maximum number of questions equals the height of the tree
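The halving strategy can be sketched as a small guessing routine (the function name is ours, not from the slides):

```python
import math
import string

# Sketch of the halving strategy: identify a letter by splitting the set
# of remaining candidates in half with each yes/no question.
def guess_letter(secret, alphabet=string.ascii_uppercase):
    candidates = list(alphabet)
    questions = 0
    while len(candidates) > 1:
        mid = len(candidates) // 2
        questions += 1                       # one yes/no question per split
        if secret in candidates[:mid]:       # "is it less than candidates[mid]?"
            candidates = candidates[:mid]
        else:
            candidates = candidates[mid:]
    return candidates[0], questions

letter, q = guess_letter("G")
print(letter, q)                 # the letter is found with at most 5 questions
print(math.ceil(math.log2(26)))  # 5: the height of the decision tree
```

The worst-case question count over the whole alphabet is ceil(log2(26)) = 5, matching the slide's bound.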
Ralph Hartley
• R. Hartley was an electronics researcher
• Contributed to the foundations of information theory
• The hartley, a unit of information equal to one decimal digit, is named after him
Information source
• How is an information source to be described mathematically?
• How much information in bits per second is produced in a given source?
How is an information source to be described mathematically?
In telegraphy, for example, the messages to be transmitted consist of sequences of letters. These sequences, however, are not completely random. In general, they form sentences and have the statistical structure of, say, English. The letter E occurs more frequently than Q, the sequence TH more frequently than XP, etc. The existence of this structure allows one to make a saving in time (or channel capacity) by properly encoding the message sequences into signal sequences.
How is an information source to be described mathematically?
We can think of a discrete source as generating the message, symbol by symbol. It will choose successive symbols according to certain probabilities depending, in general, on preceding choices as well as the particular symbols in question. A physical system, or a mathematical model of a system which produces such a sequence of symbols governed by a set of probabilities, is known as a stochastic process.
Stochastic process which generates a sequence of symbols
Using the same five letters (A, B, C, D, E) let the probabilities be .4, .1, .2, .2, .1, respectively, with successive choices independent. Typical messages from this source are then:
• A A A C D C B D C E A A D A D A C E D A
• E A D C A B E D A D D C E C A A A A A D
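Such an i.i.d. source is straightforward to simulate; the random seed below is an arbitrary assumption, chosen only to make the run reproducible:

```python
import random

# Shannon's five-letter source: symbols A..E with probabilities
# .4, .1, .2, .2, .1, successive choices independent.
random.seed(0)   # assumed seed, for reproducibility only
symbols = "ABCDE"
probs = [0.4, 0.1, 0.2, 0.2, 0.1]

message = "".join(random.choices(symbols, weights=probs, k=20))
print(message)   # a typical 20-symbol message from this source
```

A long enough sample will contain roughly 40% A's, 20% C's and D's, and 10% B's and E's, mirroring the assigned probabilities.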
Stochastic process which generates a sequence of symbols
A more complicated structure is obtained if successive symbols are not chosen independently but their probabilities depend on preceding letters.
In the simplest case of this type a choice depends only on the preceding letter and not on ones before that.
The statistical structure can then be described by a set of transition probabilities p_i(j), the probability that letter i is followed by letter j.
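A first-order source of this kind can be sketched with an assumed transition table; the alphabet and the probabilities p_i(j) below are illustrative, not Shannon's:

```python
import random

# First-order Markov source: the next letter's probability depends
# only on the preceding letter. Transition table is an assumption.
random.seed(0)   # assumed seed, for reproducibility only
symbols = "ABC"
transitions = {          # transitions[i][j] = p_i(j), rows sum to 1
    "A": [0.1, 0.6, 0.3],
    "B": [0.5, 0.2, 0.3],
    "C": [0.4, 0.4, 0.2],
}

state = "A"
message = [state]
for _ in range(19):      # draw each letter conditioned on the previous one
    state = random.choices(symbols, weights=transitions[state])[0]
    message.append(state)
print("".join(message))
```

Unlike the i.i.d. source, the frequency of each letter here depends on which letter preceded it, e.g. B follows A far more often than A follows A under this table.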
Choice, Uncertainty and Entropy
• We have represented a discrete information source as a Markoff process. Can we define a quantity which will measure, in some sense, how much information is “produced” by such a process, or better, at what rate information is produced?
• Suppose we have a set of possible events whose probabilities of occurrence are p1 ; p2 ;…; pn. These probabilities are known but that is all we know concerning which event will occur. Can we find a measure of how much “choice” is involved in the selection of the event or of how uncertain we are of the outcome?
Choice, Uncertainty and Entropy
Theorem 2:

H = −K Σ_{i=1}^{n} p_i log p_i

where the constant K merely amounts to a choice of a unit of measure
Shannon entropy characteristics
• Continuity: the measure should be continuous, so that changing the values of the probabilities by a very small amount should only change the entropy by a small amount.
• Symmetry: the measure should be unchanged if the outcomes are re-ordered
Hn (p1, p2, . . .) = Hn (p2, p1, . . .)
Shannon entropy characteristics
• Additivity: the amount of entropy should be independent of how the process is regarded as being divided into parts; if p1 and p2 are independent
Hn(p1, p2) = Hn(p1) +Hn(p2)
Shannon entropy characteristics
• Maximum: the measure should be maximal if all the outcomes are equally likely (uncertainty is highest when all possible events are equiprobable). For equiprobable events the entropy should increase with the number of outcomes:

Hn(1/n, …, 1/n) = logb(n) < logb(n+1) = Hn+1(1/(n+1), …, 1/(n+1))

Hn(p1, …, pn) ≤ Hn(1/n, …, 1/n) = logb(n)
Entropy in the case of two possibilities
• Entropy in the case of two possibilities with probabilities p and q = 1 − p:

H = −(p log p + q log q)
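The two-outcome entropy can be computed directly; this is a minimal sketch in bits, assuming K = 1 and logarithms in base 2:

```python
import math

# Binary entropy H(p) = -p*log2(p) - (1-p)*log2(1-p), in bits
def binary_entropy(p):
    if p in (0.0, 1.0):
        return 0.0           # a certain outcome carries no uncertainty
    q = 1.0 - p
    return -p * math.log2(p) - q * math.log2(q)

print(binary_entropy(0.5))              # 1.0 bit: the maximum, at p = q = 1/2
print(round(binary_entropy(0.1), 3))    # 0.469: a biased coin is more predictable
```

The function is symmetric in p and q and peaks at p = 1/2, consistent with the "maximum" property above: uncertainty is highest when the two outcomes are equiprobable.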
Choice, Uncertainty and Entropy
• Let's suppose that all symbols are equiprobable and independent, with probability p_i = 1/q (q symbols). Starting from the general expression

H = −K Σ_{i=1}^{q} p_i log p_i

the entropy of a message of N symbols can be written as

H = −K N Σ_{i=1}^{q} (1/q) log(1/q) = −K N q (1/q) log(1/q) = K N log(q)
Choice, Uncertainty and Entropy
• If the number of symbols is equal to 2 (binary system) and assuming K = 1, the entropy of the message coincides with its length:

H = K N log(q) = N log2(2) = N
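The result H = K N log(q) can be checked numerically, assuming K = 1 and logarithms in base 2 (so the unit is the bit):

```python
import math

# Entropy of a message of N independent symbols, each uniform over
# q equiprobable states: H = -K*N*sum((1/q)*log2(1/q)) = K*N*log2(q)
def message_entropy(q, N, K=1.0):
    per_symbol = -sum((1 / q) * math.log2(1 / q) for _ in range(q))
    return K * N * per_symbol

print(message_entropy(q=2, N=8))             # 8.0 bits: entropy equals the length
print(round(message_entropy(q=26, N=6), 1))  # 28.2: the 6-letter-word example
```

The q = 2 case reproduces the slide's conclusion that a binary message's entropy equals its length, and q = 26 recovers the 6 × 4.7 ≈ 28.2 questions estimated earlier for a six-character word.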