Communication Theory
I. Frigyes
2009-10/II.
http://docs.mht.bme.hu/~frigyes/hirkelm hirkelm01bEnglish
Topics
• (0. Math. introduction: stochastic processes, complex envelope)
• 1. Basics of decision and estimation theory
• 2. Transmission of digital signals over analog channels: noise effects
• 3. Transmission of digital signals over analog channels: dispersion effects
• 4. Transmission of analog signals – analog modulation methods (?)
• 5. Channel characterization: wireless channels, optical fibers
• 6. Basics of digital signal processing: sampling, quantization, signal representation
• 7. Theoretical limits in information transmission
• 8. Basics of coding theory
• 9. Correcting transmission errors: error-correcting coding; adaptive equalization
• 10. Spectral efficiency – efficient digital transmission methods
(0. Stochastic processes, the complex envelope)
Stochastic processes
• Also called random waveforms.
• 3 different meanings:
• As a function of ξ (the realization): a series of infinitely many random variables ordered in time
• As a function of time t: a member of a family of irregularly varying time functions
• As a function of ξ and t: one member of a family of time functions drawn at random
Stochastic processes
• Example:
[Figure: a family of sample functions f(t,ξ1), f(t,ξ2), f(t,ξ3) versus time t; at fixed times t1, t2, t3 the values f(t1,ξ), f(t2,ξ), f(t3,ξ) are random variables.]
Stochastic processes: how to characterize them?
• According to the third definition
• and with some probability distribution:
• as the number of random variables is infinite, with their joint distribution (or density)
• (not only infinite but of continuum cardinality)
• Taking these into account:
Frigyes: Hírkelm 8
Stochastic processes: how to characterize them?
• (Say: density)
• First prob. density of x(t):
  $p_x(x;\,t)$
• Second: joint density at t1, t2:
  $p_{x_1 x_2}(x_1, x_2;\, t_1, t_2)$
• nth: n-fold joint density:
  $p_{x_1 \dots x_n}(x_1, x_2, \dots, x_n;\, t_1, t_2, \dots, t_n)$
• The stochastic process is completely characterized if there is a rule to compose the density of any order (even for n → ∞).
• (We'll see processes depending on 2 parameters)
Stochastic processes: how to characterize them?
• Comment: although, strictly speaking, the process (a function of t and ξ) and one sample function (a function of t belonging to, say, ξ16) are distinguished, we'll not always make this distinction.
Stochastic processes: how to characterize them?
• Example: semi-random binary signal:
• Values: ±1 (P0 = P1 = 0.5)
• Change: only at t = kT
• First density:
  $p_x(x;t) = \tfrac{1}{2}\,\delta(x-1) + \tfrac{1}{2}\,\delta(x+1)$
• Second: if t1 and t2 are in the same time slot,
  $p_{x_1x_2}(x_1,x_2;\,t_1,t_2) = \tfrac{1}{2}\,\delta(x_1-1)\,\delta(x_2-1) + \tfrac{1}{2}\,\delta(x_1+1)\,\delta(x_2+1)$
• otherwise,
  $p_{x_1x_2}(x_1,x_2;\,t_1,t_2) = \tfrac{1}{4}\,\big[\delta(x_1-1)+\delta(x_1+1)\big]\,\big[\delta(x_2-1)+\delta(x_2+1)\big]$
Continuing the example:
[Figure: the second-order density – in the same time slot all probability mass lies on the 45° diagonal at (±1, ±1); in two distinct time slots it is spread equally over all four points (±1, ±1).]
Stochastic processes: the Gaussian process
• A stoch. proc. is Gaussian if its nth density is that of an n-dimensional vector random variable:
  $p_{\mathbf x}(\mathbf X) = \frac{1}{(2\pi)^{n/2}\sqrt{\det \mathbf K}}\,\exp\!\Big[-\tfrac{1}{2}(\mathbf X-\mathbf m)^{T}\mathbf K^{-1}(\mathbf X-\mathbf m)\Big]$
• m is the expected-value vector, K the covariance matrix.
• The nth density can be produced if
  $m_x(t) = \mathrm{E}\,x(t)$ and $K(t_1,t_2) = \mathrm{E}\big\{[x(t_1)-m(t_1)]\,[x(t_2)-m(t_2)]\big\}$
  are given.
Stochastic processes: the Gaussian process
• An interesting property of Gaussian processes (more precisely: of Gaussian variables):
• These can be realizations of one process at different times; for zero-mean jointly Gaussian x, y, z, w:
  $\mathrm{E}[xyzw] = \mathrm{E}[xy]\,\mathrm{E}[zw] + \mathrm{E}[xz]\,\mathrm{E}[yw] + \mathrm{E}[xw]\,\mathrm{E}[yz]$
Stochastic processes: stationary processes
• A process is stationary if it does not change (much) as time passes
• E.g. the semi-random binary signal is (almost) like that
• Phone: transmitting 300–3400 Hz is sufficient (always, for everybody). (What could we do if this didn't hold?)
• etc.
Frigyes: Hírkelm 15
Stochastic processes: stationary processes
• Precise definition of what is almost unchanged:
• A process is stationary (in the strict sense) if, for the distribution function of any order, at any times and for any time shift τ:
  $F_x(x_1,\dots,x_n;\, t_1,\dots,t_n) = F_x(x_1,\dots,x_n;\, t_1+\tau,\dots,t_n+\tau)$
• It is stationary in order n if the first n distributions are stationary
• E.g.: the example seen is first-order stationary
• In general: if stationary in order n, it is also stationary in any order < n
Stochastic processes: stationary processes
• Comment: to prove strict-sense stationarity is difficult
• But: if a Gaussian process is second-order stationary (i.e., in this case: if K(t1,t2) does not change when time is shifted), it is strict-sense (i.e. any-order) stationary. As: if we know K(t1,t2), the nth density can be computed (for any n)
Stochastic processes: stationarity in wide sense
• Wide-sense stationary: if the correlation function (to be defined) is unchanged when time is shifted
• A few definitions: a process is called a Hilbert process if
  $\mathrm{E}\,x^2(t) < \infty$
• (That means: the instantaneous power is finite.)
Stochastic processes: wide sense stationary processes
• (Auto)correlation function of a Hilbert-process:
• The process is wide sense stationary if
• the expected value is time-invariant and
• R depends only on τ=t2-t1 for any time and any τ.
$R(t_1, t_2) \triangleq \mathrm{E}\big[x(t_1)\,x(t_2)\big]$
Stochastic processes: wide sense – strict sense stationary processes
• If a process is strict-sense stationary then also wide-sense
• If at least second order stationary: then also wide sense.
• I.e.:
  $R(t_1,t_2) = \mathrm{E}[x(t_1)\,x(t_2)] = \int\!\!\int X_1 X_2\, p_{x_1x_2}(X_1,X_2;\,t_1,t_2)\,dX_1\,dX_2$
  $= \int\!\!\int X_1 X_2\, p_{x_1x_2}(X_1,X_2;\,t_1+\tau,\,t_2+\tau)\,dX_1\,dX_2 = R(t_1+\tau,\,t_2+\tau)$
• and with τ = −t1:
  $R(t_1,t_2) = R(0,\,t_2-t_1) \triangleq R(t_2-t_1)$
Stochastic processes: wide sense – strict sense stationary processes
• Further: the converse does not hold – a wide-sense stationary process need not be strict-sense stationary in any order
• Exception: the Gaussian process. This one, if wide-sense stationary, is also strict-sense stationary.
Stochastic processes: once again on binary transmission
• As seen: only first order stationary (Ex=0)
• Correlation:
• if t1 and t2 are in the same time slot:
  $R(t_1,t_2) = \mathrm{E}\big[x(t_1)\,x(t_2)\big] = 1$
• if in different ones:
  $R(t_1,t_2) = \mathrm{E}\,x(t_1)\cdot\mathrm{E}\,x(t_2) = 0$
Stochastic processes: once again on binary transmission
• The semi-random binary transmission can be transformed into a random one by introducing a dummy delay e, distributed uniformly in (0, T):
  $y(t) = x(t-e)$
• like x:
  $\mathrm{E}\,y(t) = 0$
Stochastic processes: once again on binary transmission
• Correlation:
• If |t1 − t2| > T (as e ≤ T):
  $R_y(t_1,t_2) = \mathrm{E}\,y(t_1)\cdot\mathrm{E}\,y(t_2) = 0$
• if |t1 − t2| ≤ T:
  $R_y(t_1,t_2) = 1 - \frac{|t_1-t_2|}{T}$
• so
  $R_y(\tau) = 1 - \frac{|\tau|}{T}\ \ \text{for}\ |\tau| \le T; \qquad 0\ \text{otherwise}$
Stochastic processes: once again on binary transmission
• I.e.:
[Figure: the triangular correlation function R_y(τ), rising from 0 at τ = −T to 1 at τ = 0 and falling back to 0 at τ = T.]
Stochastic processes: other type of stationarity
• Given two processes, x and y, these are jointly stationary if their joint distributions are all invariant to any time shift τ.
• Thus a complex process $z(t) = x(t) + j\,y(t)$ is stationary in the strict sense if x and y are jointly stationary.
• A process is periodic (or cyclostationary) if its distributions are invariant to time shifts kT
Stochastic processes: other type of stationarity
• Cross-correlation:
  $R_{x,y}(t_1,t_2) = \mathrm{E}\big[x(t_1)\,y(t_2)\big]$
• Two processes are jointly stationary in the wide sense if their cross-correlation is invariant to any time shift, i.e. depends only on τ = t2 − t1
Stochastic processes: comment on complex processes
• The appropriate definition of correlation for these:
  $R(t_1,t_2) = \mathrm{E}\big[x(t_1)\,x^{*}(t_2)\big]$
• A complex process is stationary in the wide sense if both its real and imaginary parts are wide-sense stationary and they are jointly wide-sense stationary as well
Stochastic processes: continuity
• There are various definitions
• Mean-square continuity:
  $\mathrm{E}\big\{[x(t_2) - x(t)]^2\big\} \to 0 \quad \text{if}\ t_2 \to t$
• That is valid if the correlation function is continuous
Stochastic processes: stochastic integral
• Let x(t) be a stoch. proc. It may be that the Riemann integral
  $s = \int_a^b x(t)\,dt$
  exists for all realizations: then s is a random variable (RV). But if not, we can define an RV converging (e.g. in mean square) to the integral-approximating sum:
Stochastic processes: stochastic integral
$s = \int_a^b x(t)\,dt \quad \text{if} \quad \lim_{\Delta t_i \to 0}\,\mathrm{E}\Big[\Big(s - \sum_{i=1}^{n} x(t_i)\,\Delta t_i\Big)^{2}\Big] = 0$
• For this RV:
  $\mathrm{E}\,s = \int_a^b \mathrm{E}\,x(t)\,dt$
  $\sigma_s^2 = \int_a^b\!\!\int_a^b \big[R(t_1,t_2) - \mathrm{E}\,x(t_1)\,\mathrm{E}\,x(t_2)\big]\,dt_1\,dt_2$
Stochastic processes: stochastic integral - comment
• In σs² the integrand is the (auto)covariance function:
  $C(t_1,t_2) \triangleq \mathrm{E}\big[x(t_1)\,x(t_2)\big] - \mathrm{E}\,x(t_1)\,\mathrm{E}\,x(t_2)$
• This depends only on t1 − t2 = τ if x is stationary (at least in the wide sense)
Stochastic processes: time average
• The integral is needed – among others – to define the time average
• The time average of a process is its DC component; the time average of its square is the mean power
• Definition:
  $\langle x \rangle \triangleq \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T} x(t)\,dt$
Stochastic processes: time average
• In general this is a random variable. It would be nice if it were equal to the statistical average. This is really the case if
  $\mathrm{E}\,\langle x \rangle = \mathrm{E}\,x(t) \quad\text{and}\quad \sigma^2_{\langle x\rangle} = 0$
• Similarly we can define the time-average correlation:
  $\langle R(\tau) \rangle \triangleq \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T} x(t)\,x(t+\tau)\,dt$
Stochastic processes: time average
• This is in general also an RV, but equal to the correlation if
  $\langle R(\tau)\rangle = R(\tau)$, with $\sigma^2_{\langle R\rangle} = 0$
• If these equalities hold the process is called ergodic
• The process is mean-square ergodic if
  $\lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T} R_x(\tau)\,d\tau = \big[\mathrm{E}\,x(t)\big]^2$
Stochastic processes: spectral density
• The spectral density of a process is, by definition, the Fourier transform of the correlation function:
  $S_x(\omega) = \int_{-\infty}^{\infty} R_x(\tau)\,e^{-j\omega\tau}\,d\tau$
Stochastic processes: spectral density
• A property: since R(τ) and S(ω) are Fourier-transform pairs,
  $\mathrm{E}\,x^2(t) = R(0) = \frac{1}{2\pi}\int_{-\infty}^{\infty} S_x(\omega)\,d\omega \ge 0$
• Consequently this integral ≥ 0; (we'll see: S(ω) ≥ 0 at every frequency)
Spectral density and linear transformation
• As known, for time functions the output is the convolution with h(t), the impulse response of the filter:
  $y(t) = x(t) * h(t) = \int_{-\infty}^{\infty} x(u)\,h(t-u)\,du$
[Block diagram: x(t) → FILTER h(t) → y(t)]
Spectral density and linear transformation
• Comment: h(t<0) ≡ 0 (why?); and: $h(t) = \mathcal F^{-1}[H(\omega)]$
• It is plausible that the same holds for stochastic processes
• Based on that it can be shown:
  $S_y(\omega) = |H(\omega)|^{2}\,S_x(\omega)$
• (And also $\mathrm{E}\,y(t) = H(0)\,\mathrm{E}\,x(t)$)
Spectral density and linear transformation
• Further: S(ω) ≥ 0 (at all frequencies)
• For if not, there would be a domain (ω1, ω2) where S(ω) < 0; an ideal bandpass filter H(ω) selecting this band would give an output spectrum S_y(ω) whose integral – the output power – is negative, which is impossible
[Figure: S_x(ω) with a negative region, the selecting filter H(ω) and the resulting S_y(ω) (its integral is negative).]
Spectral density and linear transformation
• S(ω) is the spectral density (in rad/s). As, with a narrow ideal bandpass filter H(ω) of bandwidth B_Hz around ω0:
  $S_y(\omega) = S_x(\omega)\,|H(\omega)|^{2}$
  $\text{Power} = \frac{1}{2\pi}\int_{-\infty}^{\infty} S_y(\omega)\,d\omega \approx 2\,B_{\mathrm{Hz}}\,S_x(\omega_0)$
[Figure: narrow bandpass H(ω) around ω0.]
Modulated signals – the complex envelope
• In previous studies we've seen that in radio or optical transmission one parameter of a sinusoidal carrier is influenced (e.g. made proportional to) the modulating signal.
• A general modulated signal:
  $x(t) = \sqrt{2}\,A\,d(t)\cos\big[\omega_c t + \varphi(t)\big]$
Modulated signals – the complex envelope
• Here d(t) and/or φ(t) carries the information – e.g. they are in linear relationship with the modulating signal
• Another description method (quadrature form):
  $x(t) = A\,a(t)\cos\omega_c t - A\,q(t)\sin\omega_c t$
• d, φ, a and q are real time functions – deterministic or realizations of a stoch. proc.
Modulated signals – the complex envelope
• Their relationship:
  $d(t) = \sqrt{\frac{a^2(t)+q^2(t)}{2}}; \qquad \varphi(t) = \arctan\frac{q(t)}{a(t)}$
  $a(t) = \sqrt{2}\,d(t)\cos\varphi(t); \qquad q(t) = \sqrt{2}\,d(t)\sin\varphi(t)$
• As known, x(t) can also be written as:
  $x(t) = A\,\mathrm{Re}\big\{[a(t) + j\,q(t)]\,e^{j\omega_c t}\big\}$
Modulated signals – the complex envelope
• Here a + jq is the complex envelope. Question: when and how to apply it.
• To begin with: the Fourier transform of a real function is conjugate symmetric:
  $X(\omega) \triangleq \mathcal F[x(t)]; \qquad x(t) \in \mathbb R \;\Rightarrow\; X(-\omega) = X^{*}(\omega)$
• But if so, X(ω>0) describes the signal completely: knowing it we can form the ω<0 part and retransform.
Modulated signals – the complex envelope
• Thus instead of X(ω) we can take:
  $\mathring X(\omega) \triangleq X(\omega) + \mathrm{sign}(\omega)\,X(\omega) = X(\omega)\,\big[1 + \mathrm{sign}\,\omega\big]$
• By the way:
  $\mathring X(\omega) = \begin{cases} 2X(\omega), & \omega > 0\\ 0, & \omega < 0 \end{cases}$
• The relevant time function:
  $\mathring x(t) = \mathcal F^{-1}\big[\mathring X(\omega)\big] = x(t) + j\,\big(\mathcal F^{-1}\big[-j\,\mathrm{sign}\,\omega\big] * x(t)\big)$
  ↓ the „Hilbert" filter $-j\,\mathrm{sign}\,\omega$
Modulated signals – the complex envelope
• We can write:
  $\mathring x(t) = x(t) + j\,x(t) * \mathcal F^{-1}\big[-j\,\mathrm{sign}\,\omega\big]$
• The shown inverse Fourier transform is $1/(\pi t)$. So
  $\mathring x(t) = x(t) + \frac{j}{\pi}\int_{-\infty}^{\infty}\frac{x(\tau)}{t-\tau}\,d\tau = x(t) + j\,\mathcal H\{x(t)\} = x(t) + j\,\hat x(t)$
• The imaginary part is the so-called Hilbert transform of x(t)
Modulated signals – the complex envelope
• The function just introduced is the analytic signal assigned to x(t) (it is an analytic function of the complex variable z = t + ju).
• An analytic signal can be assigned to any (baseband or modulated) function; the relationship between the time function and the analytic signal is
  $x(t) = \mathrm{Re}\,\mathring x(t)$
Modulated signals – the complex envelope
• It is applicable to modulated signals: the analytic signal of cos ωct is $e^{j\omega_c t}$; similarly that of −sin ωct is $j\,e^{j\omega_c t}$. So if the quadrature components a(t), q(t) of the modulated signal are
• band-limited and
• their band-limiting frequency is < ωc/2π (narrow-band signal)
• then
$x(t) = a(t)\cos\omega_c t \;\longrightarrow\; \mathring x(t) = a(t)\,e^{j\omega_c t}$
or
$x(t) = -q(t)\sin\omega_c t \;\longrightarrow\; \mathring x(t) = j\,q(t)\,e^{j\omega_c t}$
NB. Modulation is a linear operation in a, q: a frequency displacement.
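(Sketch added here; the signal parameters are made up.) These relations in code: scipy.signal.hilbert returns the analytic signal x̊(t) = x(t) + j·x̂(t), and multiplying by e^{−jωct} recovers the complex envelope a(t) + j·q(t):

```python
import numpy as np
from scipy.signal import hilbert

fs, fc = 1000.0, 100.0                 # sample rate and carrier (assumed)
t = np.arange(0.0, 1.0, 1.0 / fs)
a = 1.0 + 0.5 * np.cos(2 * np.pi * 3 * t)   # slow in-phase component
q = 0.3 * np.sin(2 * np.pi * 5 * t)         # slow quadrature component

x = a * np.cos(2 * np.pi * fc * t) - q * np.sin(2 * np.pi * fc * t)

x_ring = hilbert(x)                            # analytic signal x + j*Hilbert{x}
env = x_ring * np.exp(-2j * np.pi * fc * t)    # complex envelope a + j*q

print("max |Re env - a| =", np.abs(env.real - a)[50:-50].max())
print("max |Im env - q| =", np.abs(env.imag - q)[50:-50].max())
```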
Modulated signals – the complex envelope
• Thus the complex envelope
  $\tilde x(t) = a(t) + j\,q(t)$
  determines the modulated signal uniquely. In the time domain:
  $x(t) = \mathrm{Re}\big[\tilde x(t)\,e^{j\omega_c t}\big] = a(t)\cos\omega_c t - q(t)\sin\omega_c t; \qquad \mathring x(t) = \tilde x(t)\,e^{j\omega_c t}$
• Comment: in accordance with its name, $\tilde x(t)$ can be complex. (X(ω) is not conjugate symmetric around ωc.)
• Comment 2: if the bandwidth B > fc, $\mathring x(t)$ is not analytic, and its real part does not define the modulated signal.
• Comment 3: a and q can be independent signals (QAM) or can be related (FM or PM).
Modulated signals – the complex envelope
• And in the frequency domain? For the analytic signal we saw:
  $\mathring x(t) = \tilde x(t)\,e^{j\omega_c t}$, and so $\tilde X(\omega) = \mathring X(\omega + \omega_c)$
[Figure: the spectra X(ω), X̊(ω) and X̃(ω), the last one shifted down to baseband.]
Modulated signals – the complex envelope
• A linear transformation – a bandpass filter – acts on $\tilde x(t)$ as a lowpass filter.
• If H(ω) is asymmetric, $\tilde y(t)$ is complex – i.e. there is crosstalk between a(t) and q(t)
• (there was no sin component – now there is)
[Figure: X(ω) = F[m(t)cos ωct] passed through H(ω): the spectra X̊(ω), X̃(ω) = H(ω)M(ω) and the output Y(ω), Y̊(ω), Ỹ(ω).]
Modulated signals – the complex envelope, stochastic processes
• Analytic signal and complex envelope are defined for deterministic signals
• It is possible for stochastic processes as well
• No detailed discussion
• One point:
Modulated signals – the complex envelope; stochastic processes
• With $x(t) = a(t)\cos\omega_c t - q(t)\sin\omega_c t$, the correlation $R_x(t,\tau)$ contains terms in $\cos\omega_c\tau$ and $\sin\omega_c\tau$ as well as terms in $\cos 2\omega_c t$ and $\sin 2\omega_c t$
• x(t) is stationary (R_x independent of t) if and only if
  $\mathrm{E}\,a(t) = \mathrm{E}\,q(t) = 0; \qquad R_a = R_q; \qquad R_{aq} = -R_{qa}$
Narrow band (white) noise
• White noise is of course not narrow band
• Usually it can be made narrow band by a (fictive) bandpass filter:
[Figure: the white-noise spectral density S_n(ω) = N0/2 and the fictive bandpass filter H(ω), producing a narrow-band spectrum X(ω) around ±ωc.]
Narrow band (white) noise properties
$n(t) = n_c(t)\cos\omega_c t - n_s(t)\sin\omega_c t = \mathrm{Re}\big[\tilde n(t)\,e^{j\omega_c t}\big]; \qquad \tilde n(t) = n_c(t) + j\,n_s(t)$
$R_c(\tau) = R_s(\tau); \qquad R_{cs}(\tau) = -R_{sc}(\tau)$
$R_{\tilde n}(\tau) = R_c(\tau) + j\,R_{sc}(\tau)$
$S_n(\omega) = \tfrac{1}{2}\big[S_{\tilde n}(\omega - \omega_c) + S_{\tilde n}(-\omega - \omega_c)\big]$
1. Basics of decision theory and estimation theory
Detection-estimation problems in communications
• 1. Digital communication: one of a set of signals known (by the receiver) is transmitted – in the presence of noise
• E.g. (baseband binary communication):
[Block diagram: DIGITAL SOURCE → transmission channel → SINK]
  $s_1(t) = U + n(t); \qquad s_0(t) = n(t)$
• Decide: which was sent?
Detection-estimation problems in communications
• 2. An (otherwise) known signal has unknown parameter(s) (their statistics are known)
• Same block schematic; example: non-coherent FSK:
$s_1(t) = \sqrt{2}\,A\cos(\omega_1 t + \varphi_1) + n(t); \qquad s_2(t) = \sqrt{2}\,A\cos(\omega_2 t + \varphi_2) + n(t)$
$p_{\varphi_1}(\varphi) = p_{\varphi_2}(\varphi) = \begin{cases}\dfrac{1}{2\pi}, & \varphi \in (-\pi,\pi)\\ 0, & \text{otherwise}\end{cases}$
Detection-estimation problems in communications
• Other example: non-coherent FSK, over non-selective Rayleigh fading channel
$s_1(t) = \sqrt{2}\,A\cos(\omega_1 t + \varphi_1) + n(t); \qquad s_2(t) = \sqrt{2}\,A\cos(\omega_2 t + \varphi_2) + n(t)$
$p_A(A) = \frac{A}{\sigma_A^2}\,e^{-A^2/2\sigma_A^2}; \qquad p_{\varphi_{1,2}}(\varphi) = \frac{1}{2\pi}\ \text{for}\ \varphi\in(-\pi,\pi);\ 0\ \text{otherwise}$
Detection-estimation problems in communications
• 3. The signal shape undergoes random changes
• Example: antipodal signal set in very fast fading:
  $s_1(t) = s(t)\,T(t) + n(t); \qquad s_2(t) = -s(t)\,T(t) + n(t)$
[Block diagram: DIGITAL SOURCE → transmission channel with multiplicative T(t) → SINK]
Detection-estimation problems in communications
• 4. Analog radio communications: one parameter of the carrier is proportional to the time-continuous modulating signal.
• E.g.: analog FM; the estimate: m(t)
  $s(t) = \sqrt{2}\,A\cos\Big[\omega_c t + 2\pi F\!\!\int^{t}\! m(\tau)\,d\tau\Big] + n(t)$
• Or: digital transmission over frequency-selective fading. For the decision, h(t) must be known (i.e. estimated):
  $r(t) = \int h(\tau)\,s_i(t-\tau)\,d\tau + n(t); \qquad i = 1,2,\dots,M$
Basics of decision theory
• Simplest example: simple binary transmission; decision is based on N independent samples
• Model:
[Block diagram: SOURCE (H0 or H1) → CHANNEL (only statistics are known) → OBSERVATION SPACE (OS) → DECIDER (decision rule: H0? H1?) → Ĥ]
• Comment: here the hat ˆ has nothing to do with the Hilbert transform
Basics of decision theory
• Two hypotheses (H0 and H1)
• Observation: N samples → the OS is N-dimensional
• Observation vector: rᵀ = (r1, r2, …, rN)
• Decision: which was sent
• Results: 4 possibilities
• 1. H0 sent & Ĥ = H0 (correct)
• 2. H0 sent & Ĥ = H1 (erroneous)
• 3. H1 sent & Ĥ = H1 (correct)
• 4. H1 sent & Ĥ = H0 (erroneous)
Bayes decision
• Bayes decision:
• a.) the a-priori probabilities of sending H0 or H1 are known:
  $\Pr\{H_0\ \text{sent}\} = P_0; \qquad \Pr\{H_1\ \text{sent}\} = P_1$
• b.) each decision has some cost $C_{ik}$ (we decide in favor of i while k was sent)
• c.) it is sure: a false decision is more expensive than a correct one:
  $C_{01} > C_{11}; \qquad C_{10} > C_{00}$
Bayes decision
• d.) decision rule: the average cost (the so-called risk, K) should be minimal:
  $K = C_{11}P_1\Pr\{\hat H = H_1|H_1\} + C_{01}P_1\Pr\{\hat H = H_0|H_1\} + C_{00}P_0\Pr\{\hat H = H_0|H_0\} + C_{10}P_0\Pr\{\hat H = H_1|H_0\}$
[Figure: the domain of r (the OS), partitioned into decision regions (Z1) for "H1" and (Z0) for "H0"; two pdfs, $p_{r|H_1}(R|H_1)$ and $p_{r|H_0}(R|H_0)$, correspond to each point.]
Bayes decision
• Question: how to partition the OS in order to get minimal K? For that, K in detail:
  $K = C_{11}P_1\!\int_{Z_1}\! p_{\mathbf r|H_1}(\mathbf R|H_1)\,d\mathbf R + C_{01}P_1\!\int_{Z_0}\! p_{\mathbf r|H_1}(\mathbf R|H_1)\,d\mathbf R + C_{00}P_0\!\int_{Z_0}\! p_{\mathbf r|H_0}(\mathbf R|H_0)\,d\mathbf R + C_{10}P_0\!\int_{Z_1}\! p_{\mathbf r|H_0}(\mathbf R|H_0)\,d\mathbf R$
• As some decision is always taken: $Z = Z_0 \cup Z_1$ and $Z_0 \cap Z_1 = \emptyset$, so for each pdf $\int_{Z_0}(\cdot)\,d\mathbf R = 1 - \int_{Z_1}(\cdot)\,d\mathbf R$
Bayes decision
• From that:
  $K = P_1 C_{11} + P_0 C_{10} + \int_{Z_0}\Big[P_1(C_{01}-C_{11})\,p_{\mathbf r|H_1}(\mathbf R|H_1) - P_0(C_{10}-C_{00})\,p_{\mathbf r|H_0}(\mathbf R|H_0)\Big]\,d\mathbf R$
• Terms 1 & 2 are constant
• And both bracketed coefficients are > 0
• Thus K is minimal if Z0 contains exactly the points where the integrand is negative, and Z1 those where it is positive
Bayes decision
• And here we decide in favor of H1 (Ĥ = H1):
  $Z_1:\quad P_1(C_{01}-C_{11})\,p_{\mathbf r|H_1}(\mathbf R|H_1) > P_0(C_{10}-C_{00})\,p_{\mathbf r|H_0}(\mathbf R|H_0)$
• decision H0 (Ĥ = H0):
  $Z_0:\quad P_1(C_{01}-C_{11})\,p_{\mathbf r|H_1}(\mathbf R|H_1) < P_0(C_{10}-C_{00})\,p_{\mathbf r|H_0}(\mathbf R|H_0)$
Bayes decision
• It can also be written: decide for H1 if
  $\Lambda(\mathbf R) \triangleq \frac{p_{\mathbf r|H_1}(\mathbf R|H_1)}{p_{\mathbf r|H_0}(\mathbf R|H_0)} > \frac{P_0\,(C_{10}-C_{00})}{P_1\,(C_{01}-C_{11})} \triangleq \eta$
• Otherwise for H0
• Left-hand side: the likelihood ratio, Λ(R)
• Right-hand side: (from a certain aspect) the threshold, η
• Comment: Λ depends only on the realization of r (on: what did we measure?); η only on the a-priori probabilities and costs
Example of Bayesian decision
• H1: constant voltage m + Gaussian noise
• H0: Gaussian noise only (designation: φ(R; m_R, σ²) for the Gaussian pdf)
• Decision: based on N independent samples of r
• At sample #i:
  $p_{r|H_1}(R_i|H_1) = \varphi(R_i;\,m,\,\sigma^2); \qquad p_{r|H_0}(R_i|H_0) = \varphi(R_i;\,0,\,\sigma^2)$
• This results in
  $p_{\mathbf r|H_1}(\mathbf R|H_1) = \prod_{i=1}^{N}\varphi(R_i;\,m,\,\sigma^2); \qquad p_{\mathbf r|H_0}(\mathbf R|H_0) = \prod_{i=1}^{N}\varphi(R_i;\,0,\,\sigma^2)$
Example of Bayesian decision
• The likelihood ratio:
  $\Lambda(\mathbf R) = \frac{\prod_{i=1}^{N}\frac{1}{\sqrt{2\pi}\,\sigma}\exp\big[-\frac{(R_i-m)^2}{2\sigma^2}\big]}{\prod_{i=1}^{N}\frac{1}{\sqrt{2\pi}\,\sigma}\exp\big[-\frac{R_i^2}{2\sigma^2}\big]}$
• its logarithm:
  $\ln\Lambda(\mathbf R) = \frac{m}{\sigma^2}\sum_{i=1}^{N}R_i - \frac{N m^2}{2\sigma^2}$
• resulting in the decision rule:
  $\hat H = H_1\ \text{if}\ \sum_{i=1}^{N}R_i > \frac{\sigma^2}{m}\ln\eta + \frac{Nm}{2}; \qquad \hat H = H_0\ \text{if}\ \sum_{i=1}^{N}R_i < \frac{\sigma^2}{m}\ln\eta + \frac{Nm}{2}$
  (the right-hand side is the threshold)
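(Monte-Carlo sketch added to the transcript; m, σ, N and the costs below are assumed values.) The decision rule above in code – the sufficient statistic ΣRᵢ is compared with the threshold (σ²/m)ln η + Nm/2:

```python
import numpy as np

rng = np.random.default_rng(2)
m, sigma, N = 1.0, 1.0, 4                    # assumed signal level, noise std, samples
P0 = P1 = 0.5
C00 = C11 = 0.0
C01 = C10 = 1.0                              # error-probability cost assignment
eta = P0 * (C10 - C00) / (P1 * (C01 - C11))
thr = sigma**2 / m * np.log(eta) + N * m / 2 # decision threshold for sum(R_i)

trials = 200_000
h1_sent = rng.random(trials) < P1
r = sigma * rng.standard_normal((trials, N)) + np.where(h1_sent, m, 0.0)[:, None]

h1_decided = r.sum(axis=1) > thr             # LRT via the sufficient statistic
print("measured error rate:", np.mean(h1_decided != h1_sent))
```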
Comments to the example
• 1. The threshold contains known quantities only, independent of the measured values
• 2. The result depends only on the sum of the Rᵢ – we need to know only that; the so-called sufficient statistic:
  $l(\mathbf R) = \sum_{i=1}^{N} R_i$
• 2.a As in this example: whatever the OS dimension, l(R) is always one-dimensional,
• i.e. „1 coordinate" – the others are independent of the hypothesis
• Thus the decision process:
[Block diagram: SOURCE (H0, H1) → CHANNEL (only statistics known) → OBSERVATION SPACE (OS) → l(R) → DECISION SPACE (DS) → DECIDER (decision rule) → Ĥ]
Comments to the example
• 3. Special case: C00 = C11 = 0 and C01 = C10 = 1; then
  $K = P_0\!\int_{Z_1} p_{\mathbf r|H_0}(\mathbf R|H_0)\,d\mathbf R + P_1\!\int_{Z_0} p_{\mathbf r|H_1}(\mathbf R|H_1)\,d\mathbf R$
• (i.e. the probability of erroneous decision)
• If P0 = P1 = 0.5, the threshold is Nm/2
Another example, for home
• Similar, but now the signal is not constant but Gaussian noise with variance σ_S²
• I.e. H1: $\prod_i \varphi(R_i;\,0,\,\sigma_S^2+\sigma^2)$
• H0: $\prod_i \varphi(R_i;\,0,\,\sigma^2)$
• Questions: threshold; sufficient statistics
Third example - discrete
• Given two Poisson sources with different expected values; which was sent?
• Remember the Poisson distribution:
  $\Pr\{n\} = \frac{m^n}{n!}\,e^{-m}$
• Hypotheses:
  $H_1:\ \Pr\{n\} = \frac{m_1^n}{n!}\,e^{-m_1}; \qquad H_0:\ \Pr\{n\} = \frac{m_0^n}{n!}\,e^{-m_0}$
Third example - discrete
• Likelihood ratio:
  $\Lambda(n) = \Big(\frac{m_1}{m_0}\Big)^{n}\,e^{-(m_1-m_0)}$
• Decision rule (m1 > m0):
  $\text{if}\ n\,\ln\frac{m_1}{m_0} > \ln\eta + m_1 - m_0:\ \hat H = H_1; \qquad \text{if}\ <:\ \hat H = H_0$
• For the sake of precision: if n is exactly equal to the threshold, decide H0 or H1 at random with Pr = 0.5 each
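(Tiny added sketch with assumed means m0 = 2, m1 = 6 and η = 1.) The rule reduces to comparing the observed count n with a single number:

```python
import math

m0, m1, eta = 2.0, 6.0, 1.0                        # assumed means and threshold
n_star = (math.log(eta) + m1 - m0) / math.log(m1 / m0)
print(f"decide H1 if n > {n_star:.3f}, H0 if n < {n_star:.3f}")
# here n_star ~ 3.64, so counts n >= 4 go to H1 and n <= 3 to H0
```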
Comment
• A possible situation: the a-priori probabilities are not known
• A possible method then: compute the maximal K (as a function of the Pᵢ) and choose the decision rule which minimizes that (the so-called minimax decision)
• (Note: this is not optimal at any particular Pᵢ)
• But we don't deal with this in detail.
Probability of erroneous decision
• For that: compute the relevant integral
• In example 1 (with N = 1): the Gaussian pdfs must be integrated over the hatched domains beyond the threshold
  $d_0 = \frac{\sigma^2}{m}\ln\eta + \frac{m}{2}$
[Figure: the two conditional pdfs φ(R; 0, σ²) and φ(R; m, σ²), the threshold between them and the distances d0, d1 to the two means.]
Probability of erroneous decision
• Thus:
  $P(1|0) = \int_{d_0}^{\infty}\varphi(R;\,0,\sigma^2)\,dR = \tfrac{1}{2}\,\mathrm{erfc}\Big(\frac{d_0}{\sigma\sqrt 2}\Big)$
  $P(0|1) = \int_{-\infty}^{m-d_1}\varphi(R;\,m,\sigma^2)\,dR = \tfrac{1}{2}\,\mathrm{erfc}\Big(\frac{d_1}{\sigma\sqrt 2}\Big)$
  $P_E = P_0\,P(1|0) + P_1\,P(0|1)$
• If ln η = 0: d0 = d1 = m/2 (the threshold is at the point of intersection)
• Comment: with N samples the threshold is $\frac{\sigma^2}{m}\ln\eta + \frac{Nm}{2}$, and σ is replaced by $\sigma\sqrt N$
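(Added check; the same assumed parameters m = σ = 1, N = 1 and ln η = 0 as in the example.) The closed form against a Monte-Carlo estimate:

```python
import math
import numpy as np

m, sigma = 1.0, 1.0                      # assumed values; N = 1, ln(eta) = 0
d0 = d1 = m / 2
PE = 0.5 * 0.5 * math.erfc(d0 / (sigma * math.sqrt(2))) \
   + 0.5 * 0.5 * math.erfc(d1 / (sigma * math.sqrt(2)))

rng = np.random.default_rng(3)
trials = 1_000_000
h1 = rng.random(trials) < 0.5
r = sigma * rng.standard_normal(trials) + np.where(h1, m, 0.0)
PE_mc = np.mean((r > m / 2) != h1)

print(f"formula P_E = {PE:.5f}, Monte-Carlo = {PE_mc:.5f}")
```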
Decision with more than 2 hypotheses
• M possible outcomes (e.g. non-binary digital communication – we'll see later why)
• Like before: each decision has a cost
• Their average is the risk
• With Bayes-decision: this is minimized
• Like before: Observation Space
• decision rule: partitioning of the OS
Decision with more than 2 hypotheses
• Like before, the risk:
  $K = \sum_{i=0}^{M-1}\sum_{j=0}^{M-1} C_{ij}\,P_j \int_{Z_i} p_{\mathbf r|H_j}(\mathbf R|H_j)\,d\mathbf R$
• From that (with M = 3):
  $K = P_0 C_{00} + P_1 C_{11} + P_2 C_{22}$
  $\; + \int_{Z_0}\big[P_1(C_{01}-C_{11})\,p_{\mathbf r|H_1}(\mathbf R|H_1) + P_2(C_{02}-C_{22})\,p_{\mathbf r|H_2}(\mathbf R|H_2)\big]\,d\mathbf R$
  $\; + \int_{Z_1}\big[P_0(C_{10}-C_{00})\,p_{\mathbf r|H_0}(\mathbf R|H_0) + P_2(C_{12}-C_{22})\,p_{\mathbf r|H_2}(\mathbf R|H_2)\big]\,d\mathbf R$
  $\; + \int_{Z_2}\big[P_0(C_{20}-C_{00})\,p_{\mathbf r|H_0}(\mathbf R|H_0) + P_1(C_{21}-C_{11})\,p_{\mathbf r|H_1}(\mathbf R|H_1)\big]\,d\mathbf R$
Decision with more than two hypotheses
• Likelihood-ratio series:
  $\Lambda_1(\mathbf R) \triangleq \frac{p_{\mathbf r|H_1}(\mathbf R|H_1)}{p_{\mathbf r|H_0}(\mathbf R|H_0)}; \qquad \Lambda_2(\mathbf R) \triangleq \frac{p_{\mathbf r|H_2}(\mathbf R|H_2)}{p_{\mathbf r|H_0}(\mathbf R|H_0)}$
• Decision rule(s): three pairwise comparisons of the form
  $P_1(C_{01}-C_{11})\,\Lambda_1(\mathbf R) \;\gtrless\; P_0(C_{10}-C_{00}) + P_2(C_{12}-C_{02})\,\Lambda_2(\mathbf R)$
  deciding „Ĥ = H1 or …" vs. „Ĥ = H0 or …" according to larger or less, with two analogous inequalities for the other hypothesis pairs
Decision with more than two hypotheses (M = 3)
• This defines 3 straight lines (in the 2D decision space)
[Figure: the (Λ1(R), Λ2(R)) plane divided by three straight lines into regions H0, H1, H2.]
Example: special case – error probability
• With $C_{ii} = 0;\ C_{ij} = 1\ (i \neq j)$ the average error probability is minimized
• Then we get:
[Figure: decision regions in the (Λ1, Λ2) plane with boundaries Λ1 = P0/P1, Λ2 = P0/P2 and the line Λ2 = (P1/P2)Λ1.]
The previous, detailed
• With $C_{ii} = 0,\ C_{ij} = 1$ the three inequalities reduce to:
  $P_1\Lambda_1(\mathbf R) \gtrless P_0; \qquad P_2\Lambda_2(\mathbf R) \gtrless P_0; \qquad P_2\Lambda_2(\mathbf R) \gtrless P_1\Lambda_1(\mathbf R)$
  each deciding between the corresponding hypothesis pair („Ĥ = Hᵢ or …" if larger, „Ĥ = Hⱼ or …" if smaller)
[Figure: the same decision regions H0, H1, H2 in the (Λ1, Λ2) plane, with boundaries P0/P1, P0/P2 and Λ2 = (P1/P2)Λ1.]
Example –special case, error probability: a-posteriori prob.
• Very important: based on the preceding we have
  $H_1\ \text{or}\ H_0:\quad P_1\,p_{\mathbf r|H_1}(\mathbf R|H_1) \gtrless P_0\,p_{\mathbf r|H_0}(\mathbf R|H_0)$
  $H_2\ \text{or}\ H_0:\quad P_2\,p_{\mathbf r|H_2}(\mathbf R|H_2) \gtrless P_0\,p_{\mathbf r|H_0}(\mathbf R|H_0)$
  $H_2\ \text{or}\ H_1:\quad P_2\,p_{\mathbf r|H_2}(\mathbf R|H_2) \gtrless P_1\,p_{\mathbf r|H_1}(\mathbf R|H_1)$
• If we divide each by $p_{\mathbf r}(\mathbf R)$ and apply the Bayes theorem
  $\Pr\{a|b\} = \frac{\Pr\{b|a\}\,\Pr\{a\}}{\Pr\{b\}}$
  we get:
  $\Pr\{H_1|\mathbf R\} \gtrless \Pr\{H_0|\mathbf R\}; \qquad \Pr\{H_2|\mathbf R\} \gtrless \Pr\{H_0|\mathbf R\}; \qquad \Pr\{H_2|\mathbf R\} \gtrless \Pr\{H_1|\mathbf R\}$
  (these are a-posteriori probabilities)
Example –special case, error probability: a-posteriori prob.
• I.e. we have to decide on the max. a-posteriori probabilities.
• Rather plausible: probability of correct decision is the highest if we decide on what is the most probable
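(Added sketch; the three-hypothesis Gaussian setup below is hypothetical.) Deciding on the maximal a-posteriori probability means maximizing Pᵢ·p(R|Hᵢ) over i, conveniently done with log-posteriors:

```python
import numpy as np

def map_decide(R, priors, means, sigma):
    """Decide among Gaussian hypotheses H_i: each sample R_j ~ N(means[i], sigma^2)."""
    R = np.atleast_1d(R)
    # log of P_i * p(R|H_i), dropping terms common to all hypotheses
    scores = [np.log(P) - np.sum((R - mu) ** 2) / (2 * sigma**2)
              for P, mu in zip(priors, means)]
    return int(np.argmax(scores))

# hypothetical three-level example with unequal priors
i_hat = map_decide([0.9, 1.1], priors=[0.5, 0.3, 0.2], means=[0.0, 1.0, 2.0], sigma=1.0)
print("decision: H", i_hat)
```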
Bayes theorem (conditional probabilities)
• For discrete variables:
  $\Pr\{a|b\} = \frac{\Pr\{b|a\}\,\Pr\{a\}}{\Pr\{b\}}$
• Continuous:
  $p(a|b) = \frac{p(b|a)\,p(a)}{p(b)}$
• a discrete, b continuous:
  $\Pr\{a|b\} = \frac{p(b|a)\,\Pr\{a\}}{p(b)}$
Comments
• 1. The Observation Space is N-dimensional (N is the number of observations).
• The Decision Space is (M−1)-dimensional (M is the number of hypotheses).
• 2. Explicitly we dealt only with independent samples; the investigation is much more complicated if these are correlated
• 3. We'll see that in digital transmission the case N > 1 is often not very important
Basics of estimation theory: parameter-estimation
• Task is to estimate unknown parameter(s) of an analog or digital signal
• Examples: voltage measurement in noise: $r = a + n$
• digital signal; phase measurement:
  $s_i(t) = \sqrt{2}\,a\cos\Big(\omega_c t + \varphi + i\,\frac{2\pi}{M}\Big),\quad i = 0,1,\dots,M-1; \qquad r(t) = s_i(t) + n(t)$
Basics of estimation theory: parameter-estimation
• Frequency estimation
• synchronizing an inaccurate oscillator
• Power estimation of interfering signals
• interference cancellation (via the antenna or multiuser detection)
• SNR estimation
• etc.
Basics of estimation theory: parameter-estimation
• The parameter can be: • i. a random variable (pdf is assumed to be known)
or• ii. an unknown deterministic value• Model:
[Block diagram: PARAMETER SPACE with pdf p_a(A) (the domain of the estimated parameter) → mapping to the OBSERVATION SPACE (OS) → estimation rule → ESTIMATION SPACE, â(R)]
• i. means: we have some a-priori knowledge
• ii. means: we have no a-priori knowledge about its magnitude
Example – details (estimation 1)
• We want to measure the voltage a
• We know its a-priori pdf p_a(A)
• And that Gaussian noise is added: φ(n; 0, σ_n²)
• The observable parameter is r = a + n
• Mapping of the parameter to the OS:
  $p_{r|a}(R|A) = \frac{1}{\sqrt{2\pi}\,\sigma_n}\exp\Big[-\frac{(R-A)^2}{2\sigma_n^2}\Big]$
Parameter estimation –parameter is a RV
• Similar principle: each estimate has some cost; its average is the risk; we want to minimize that
• Realization of the parameter: a
• Observation vector: R
• Estimated value: â(R)
• In the general case the cost is a two-variable function: C(a, â)
• The error of the estimation is ε = a − â(R)
• Often the cost depends only on that: C = C(ε)
Parameter estimation –parameter is a RV
• Examples:
  $C(\varepsilon) = \varepsilon^2; \qquad C(\varepsilon) = |\varepsilon|; \qquad C(\varepsilon) = \begin{cases}0, & |\varepsilon| \le \Delta/2\\ 1, & |\varepsilon| > \Delta/2\end{cases}$
• The risk is
  $K = \mathrm{E}\,C = \int\!\!\int C\big(A,\hat a(\mathbf R)\big)\,p_{a,\mathbf r}(A,\mathbf R)\,dA\,d\mathbf R$
• The joint pdf can be written:
  $p_{a,\mathbf r}(A,\mathbf R) = p_{a|\mathbf r}(A|\mathbf R)\,p_{\mathbf r}(\mathbf R)$
Parameter estimation –parameter is a RV
• Applying this to the square cost function (subscript ms: mean square):
  $K_{ms} = \int p_{\mathbf r}(\mathbf R)\,d\mathbf R \int \big[A - \hat a(\mathbf R)\big]^2\, p_{a|\mathbf r}(A|\mathbf R)\,dA$
• K = min (i.e. $dK/d\hat a = 0$) where the inner integral is minimal (as the outer factor is i. positive and ii. does not depend on A)
Parameter estimation –parameter is a RV
$\frac{d}{d\hat a}\int \big[A - \hat a\big]^2 p_{a|\mathbf r}(A|\mathbf R)\,dA = -2\int A\,p_{a|\mathbf r}(A|\mathbf R)\,dA + 2\,\hat a\int p_{a|\mathbf r}(A|\mathbf R)\,dA = 0$
• The second integral = 1, thus
  $\hat a_{ms}(\mathbf R) = \int A\,p_{a|\mathbf r}(A|\mathbf R)\,dA$
• i.e. the a-posteriori expected value of a.
• (According to the previous definition: a-posteriori knowledge is what is gained from the measurement/investigation.)
Comment
• Coming back to the risk:
  $K_{ms} = \int p_{\mathbf r}(\mathbf R)\,d\mathbf R \int \big[A - \hat a_{ms}(\mathbf R)\big]^2\, p_{a|\mathbf r}(A|\mathbf R)\,dA$
• The inner integral is now the conditional variance, σ_a²(R). Thus
  $K_{ms} = \int \sigma_a^2(\mathbf R)\,p_{\mathbf r}(\mathbf R)\,d\mathbf R = \mathrm{E}\,\sigma_a^2(\mathbf R)$
Parameter estimation –parameter is a RV
• Another cost function: 0 in a band Δ, elsewhere 1. The risk is now (un: uniform):
  $K_{un} = \int p_{\mathbf r}(\mathbf R)\,d\mathbf R\,\Big[1 - \int_{\hat a - \Delta/2}^{\hat a + \Delta/2} p_{a|\mathbf r}(A|\mathbf R)\,dA\Big]$
• K = min if the result of the estimation is the maximum of the conditional pdf (if Δ is small): maximum a-posteriori – MAP estimation
Parameter estimation –parameter is a RV
• Then the derivative of the log conditional pdf = 0:
  $\hat a = A:\quad \frac{\partial \ln p_{a|\mathbf r}(A|\mathbf R)}{\partial A} = 0$
• (the so-called MAP equation)
• Applying the Bayes theorem
  $p_{a|\mathbf r}(A|\mathbf R) = \frac{p_{\mathbf r|a}(\mathbf R|A)\,p_a(A)}{p_{\mathbf r}(\mathbf R)}$
  the log conditional pdf can be written:
  $\ln p_{a|\mathbf r}(A|\mathbf R) = \ln p_{\mathbf r|a}(\mathbf R|A) + \ln p_a(A) - \ln p_{\mathbf r}(\mathbf R)$
Parameter estimation –parameter is a RV
• The first term is the statistical relationship between A and R
• The second is the a-priori knowledge
• The last term does not depend on A.
• Thus what has to be maximized is
  $l(A) \triangleq \ln p_{\mathbf r|a}(\mathbf R|A) + \ln p_a(A)$
• And so the MAP equation is:
  $\hat a = A:\quad \frac{\partial}{\partial A}\Big[\ln p_{\mathbf r|a}(\mathbf R|A) + \ln p_a(A)\Big] = 0$
Once again
• The Minimum Mean Square Error (MMSE) estimate is the average of the a-posteriori pdf
• The Maximum A-Posteriori (MAP) estimate is the maximum of the a-posteriori pdf
Example (estimate-2)
• Again we have Gaussian a + n, but now N independent samples
• What we need for (any) estimate is the a-posteriori pdf:
  $p_a(A) = \frac{1}{\sqrt{2\pi}\,\sigma_a}\exp\Big(-\frac{A^2}{2\sigma_a^2}\Big); \qquad p_{\mathbf r|a}(\mathbf R|A) = \prod_{i=1}^{N}\frac{1}{\sqrt{2\pi}\,\sigma_n}\exp\Big(-\frac{(R_i-A)^2}{2\sigma_n^2}\Big)$
  $p_{a|\mathbf r}(A|\mathbf R) = \frac{p_{\mathbf r|a}(\mathbf R|A)\,p_a(A)}{p_{\mathbf r}(\mathbf R)}$
Example (estimate-2)
• Note that p_r(R) is constant from the point of view of the conditional pdf, thus its form is irrelevant. Thus
  $p_{a|\mathbf r}(A|\mathbf R) = k(\mathbf R)\,\exp\left\{-\left[\sum_{i=1}^{N}\frac{(R_i - A)^2}{2\sigma_n^2} + \frac{A^2}{2\sigma_a^2}\right]\right\}$
Example (estimate-2)
• This is a Gaussian distribution and we need its expected value. For that the square must be completed in the exponent, resulting in
  $p_{a|\mathbf r}(A|\mathbf R) = k'(\mathbf R)\,\exp\left[-\frac{1}{2\sigma_p^2}\Big(A - \frac{\sigma_a^2}{\sigma_a^2 + \sigma_n^2/N}\cdot\frac{1}{N}\sum_{i=1}^{N}R_i\Big)^{\!2}\right]$
  with
  $\sigma_p^2 = \frac{\sigma_a^2\,\sigma_n^2/N}{\sigma_a^2 + \sigma_n^2/N}$
Example (estimate-2)
• In a Gaussian pdf average = mode, thus
  $\hat a_{ms}(\mathbf R) = \hat a_{map}(\mathbf R) = \frac{\sigma_a^2}{\sigma_a^2 + \sigma_n^2/N}\cdot\frac{1}{N}\sum_{i=1}^{N} R_i$
Example (estimate-3)
• a is now also Gaussian, φ(A; 0, σ_a²), but only s(A), a nonlinear function of it, can be observed (e.g. the phase of a carrier); noise is added:
  $r_i = s_i(A) + n_i$
• The a-posteriori density function:
  $p_{a|\mathbf r}(A|\mathbf R) = k(\mathbf R)\,\exp\left\{-\left[\sum_{i=1}^{N}\frac{\big(R_i - s_i(A)\big)^2}{2\sigma_n^2} + \frac{A^2}{2\sigma_a^2}\right]\right\}$
Example (estimate-3)
• Remember the MAP equation:
  $\hat a = A:\quad \frac{\partial}{\partial A}\Big[\ln p_{\mathbf r|a}(\mathbf R|A) + \ln p_a(A)\Big] = 0$
• Applying that to the preceding:
  $\hat a_{MAP}(\mathbf R) = A:\quad \frac{\sigma_a^2}{\sigma_n^2}\sum_{i=1}^{N}\big[R_i - s_i(A)\big]\,\frac{\partial s_i(A)}{\partial A} = A$
Parameter estimation – the parameter is a real constant
• In that case it is the measurement result which is an RV. E.g. in the case of the square cost function the risk is:
  $K(A) = \int \big[A - \hat a(\mathbf R)\big]^2\, p_{\mathbf r|a}(\mathbf R|A)\,d\mathbf R$
• This would be minimal if â(R) = A. But this makes no sense: that is just the value we want to estimate.
• In this case – i.e. if we have no a-priori knowledge – the method can be:
• we search for a function of the observations – an estimator – which is "good" (has average close to A and low variance)
Közbevetve: a tételek eddig
• 1. Sztohasztikus folyamatok: főként a fogalmak definiciója
• (sztoh. foly.; val. sűrűségek-eloszlások, erős stacioanaritás; korrelációs fv. gyenge stacionaritás, időátlag, ergodicitás spektrális sűrűség, lin. transzformáció)
• 2. Modulált jelek – komplex burkoló• (mod. jelek leírása, időfüggv., analitikus függv.
(frekv-idő) komplex burkoló, egyikből a másik, szűrő)
Frigyes: Hírkelm 112
Közbevetve: a tételek eddig/2
• 3. A döntéselmélet alapjai• (Milyen feladatok; költség, kockázat, a-priori val,
Bayes-f. döntés; a megfigyelési tér opt. felosztása, küszöb, elégséges statisztika, M hipotézis (nem kell rész-letezni, csak az eredmény), a-post. val.)
• 4. A becsléselmélet alapjai• (Val.-vált paraméter, költségfüggv. min. ms.
max. a-post.; determ. param, likelyhood fv. ML, MVU becslés, torzítatlan; Cramér-Rao; hatékony)
Frigyes: Hírkelm 113
Parameter estimation – the parameter is a real constant – criteria for the estimator
• Average of the – somehow chosen – estimator:
  $\mathrm{E}\,\hat a(\mathbf R) = \int \hat a(\mathbf R)\,p_{\mathbf r|a}(\mathbf R|A)\,d\mathbf R = \begin{cases} A & \\ A + B & \\ A + B(A) & \end{cases}$
• If this is = A: unbiased estimate; the operation of estimation amounts to searching for the average
• If the bias B is constant: it can be subtracted from the average (e.g. in optics: background radiation)
• If B = f(A): biased estimate
Parameter estimation – the parameter is a real constant – criteria for the estimator
• Error variance:
  $\sigma_{\hat a}^2 = \mathrm{E}\big\{\big[\hat a(\mathbf R) - A - B(A)\big]^2\big\}$
• Best would be: unbiased and low variance
• more precisely: unbiased and minimal variance for any A (MVU)
• Such an estimator does or does not exist
• An often applied estimator: maximum likelihood
• Likelihood function: $p_{\mathbf r|a}(\mathbf R|A)$ as a function of A
• The estimator of A is then the maximum of the likelihood function
A few details
[Figure: two sampling pdfs of an estimator Θ̂; the narrower one is certainly better than the other: the variance of the measuring result is lower.]
Max. likelihood (ML) estimation
• Necessary condition for the maximum of the log likelihood:
  $\hat a = A:\quad \frac{\partial \ln p_{\mathbf r|a}(\mathbf R|A)}{\partial A} = 0$
• Remember: in the case of an RV parameter, the MAP estimate:
  $\hat a = A:\quad \frac{\partial}{\partial A}\Big[\ln p_{\mathbf r|a}(\mathbf R|A) + \ln p_a(A)\Big] = 0$
• Thus: ML is the same without a-priori knowledge
Max. likelihood (ML) estimation
• For any unbiased estimate: the variance cannot be lower than the Cramér-Rao lower bound (CRLB):
  $\sigma_{\hat a}^2 \ge \left\{\mathrm{E}\!\left[\left(\frac{\partial \ln p_{\mathbf r|a}(\mathbf R|A)}{\partial A}\right)^{\!2}\right]\right\}^{-1}$
• or in an other form:
  $\sigma_{\hat a}^2 \ge \left\{-\,\mathrm{E}\!\left[\frac{\partial^2 \ln p_{\mathbf r|a}(\mathbf R|A)}{\partial A^2}\right]\right\}^{-1}$
Max. likelihood (ML) estimation
• Proof of the CRLB: via the Schwartz inequality
• If the estimate is equal to the CRLB: efficient estimate.
[Figure: example of an existing MVU but no (or unknown) efficient estimate: var θ̂₁ > CRLB.]
Example 1 (estimation, nonrandom)
• Voltage + noise, but the voltage is a real, nonrandom constant A:
  $r_i = A + n_i,\ i = 1,2,\dots,N; \qquad p_n(n) = \varphi(n;\,0,\,\sigma_n^2)$
• Maximum likelihood estimation:
  $\frac{\partial \ln p_{\mathbf r|A}(\mathbf R|A)}{\partial A} = \frac{1}{\sigma_n^2}\sum_{i=1}^{N}(R_i - A) = 0 \quad\Longrightarrow\quad \hat a_{ML}(\mathbf R) = \frac{1}{N}\sum_{i=1}^{N} R_i$
Example 1 (estimation, nonrandom)
• Is it biased?
  $\mathrm{E}\,\hat a_{ML}(\mathbf R) = \frac{1}{N}\sum_{i=1}^{N}\mathrm{E}\,R_i = \frac{1}{N}\,N A = A$
• The expected value of â is the (a-priori unknown) true value; i.e. it is unbiased
Example 2 (estimation, nonrandom): phase of a sinusoid
• What can be measured is a function of the quantity to be estimated:
  $r_k = \sqrt{2}\,A\cos(\omega t_k + \theta) + n_k$
• (Independent samples are taken at equal time intervals)
• The likelihood function $p_{\mathbf r|\theta}(\mathbf R|\theta)$ is to be maximized:
Example 2 (estimation, nonrandom): phase of a sinusoid
• This is maximal if the function below is minimal:
  $\sum_{k}\big[R_k - \sqrt{2}\,A\cos(\omega t_k + \theta)\big]^2$
• Setting its derivative with respect to θ equal to 0:
  $\sum_{k} R_k \sin(\omega t_k + \theta) = \frac{A}{\sqrt 2}\sum_{k}\sin\big(2(\omega t_k + \theta)\big)$
• The right-hand side tends to 0 for large N
Example 2 (estimation, nonrandom): phase of a sinusoid
• And then finally:
  $\hat\theta_{ML}(\mathbf R) = -\arctan\frac{\sum_k R_k \sin\omega t_k}{\sum_k R_k \cos\omega t_k}$
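(Added sketch; the closed form above is the standard large-N ML phase estimate, and the amplitude, noise level and sampling grid below are assumptions.) A quick numerical trial:

```python
import numpy as np

rng = np.random.default_rng(6)
A, theta, sigma_n, N = 1.0, 0.7, 0.5, 200    # assumed amplitude, phase, noise, N
w = 2 * np.pi * 0.05                          # normalized frequency (assumed)
k = np.arange(N)

R = A * np.cos(w * k + theta) + sigma_n * rng.standard_normal(N)

theta_hat = -np.arctan2(np.sum(R * np.sin(w * k)), np.sum(R * np.cos(w * k)))
print(f"true theta = {theta:.3f}, ML estimate = {theta_hat:.3f}")
```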
2. Transmission of digital signals over analog channels: effect of
noise
Introductory comments
• The theory of digital transmission is (at least partly) an application of decision theory
• Definition of digital signals/transmission:
• a finite number of signal shapes (M)
• each has the same finite duration (T)
• the receiver knows the signal shapes a priori (they are stored)
• So the task of the receiver is hypothesis testing.
Introductory comments – degrading effects
[Block diagram: s(t) → NONLINEAR AMPLIFIER → BANDPASS FILTER (ωc) → FADING CHANNEL + n(t) → BANDPASS FILTER → DECISION MAKER; co-channel interference (CCI) z0(t) at ωc and adjacent-channel interference (ACI) z1(t), z2(t) at ω1, ω2 are added along the way.]
Introductory comments
• Quality parameter: error probability
• (I.e. the costs are: $C_{ii} = 0;\ C_{ik} = 1;\ i,k = 1,2,\dots,M$)
• Erroneous decision may be caused by:
• additive noise
• linear distortion
• nonlinear distortion
• additive interference (CCI, ACI)
• false knowledge of a parameter, e.g. a synchronization error
Introductory comments
• Often it is not one signal whose error probability is of interest but a group of signals – e.g. a frame.
• (A second quality parameter: erroneous recognition of T – the jitter.)
Transmission of single signals in additive Gaussian noise
• Among the many sources of error we now regard only this one
• Model to be investigated:
[Block diagram: SOURCE {mᵢ}, Pᵢ → SIGNAL GENERATOR sᵢ(t) → + n(t) → r(t) = sᵢ(t) + n(t) → DECISION MAKER (with TIMING T) → m̂ → SINK]
Transmission of single signals in additive Gaussian noise
• Specifications:
• the a-priori probabilities Pᵢ are known
• the support of the real time functions sᵢ(t) is (0, T)
• their energy E (the square integral of the time functions) is finite
• the relationship $m_i \leftrightarrow s_i(t)$ is mutual and unique (i.e. there is no error in the transmitter)