TRANSCRIPT
Athens - 11 May 2004
Statistics and Probability in Geosciences:
Fernando Sansó
DIIAR - Politecnico di Milano – Polo Regionale di Como
from Least Squares to Random Fields
Statistics of Geosciences
From the availability of new electronic hardware, new impulses have come to model theory and field theory.
(Timeline of names: Legendre, Laplace, Gauss, Poisson, Markov; Jeffreys, Baarda, Moritz, Krarup; Tarantola-Valette, Geman-Geman)
Statistics drifts away from its origins, so much entangled with geo-sciences (and astronomy)
Recognizing that Statistics is the art and science
of ambiguous knowledge,
I claim that the “whole probabilistic”,
Bayesian, point of view can be taken as a
unifying foundation for all spatial
information sciences.
Scientific concepts are born of an abstraction process, namely
when we observe natural phenomena, after eliminating a multitude of tiny details, we can grasp the common and regular elements on which axioms, rules and laws can be built.
(From Plato’s ideas to Euclid’s elements.)
Examples
• a straight line; who has ever seen one?
• a Euclidean triangle: measuring its angles at the astronomical level has become one of the means to decide about the curvature of the universe;
• the Galilean equivalence principle, which emerges from physical experiments only by abstracting from friction, by assuming a constant gravity etc.
Psychologically one can understand why, at the beginning of modern science, uncertainty was considered the enemy, classified as “measurement error”.
This is how modern statistics was born, from the very beginning, as “error theory”: based on a probabilistic interpretation via the central limit theorem and used in an inferential approach to produce “best” estimates of the parameters of interest, according to a proto-maximum-likelihood criterion.
Historical examples
• The astronomical measurement of Jupiter’s diameters to test the hypothesis that its figure was an ellipsoid rotating around its minor axis.
• The geodetic measurements of arcs of meridians, performed by the French Academy in France, in Lapland and on the Andes, to measure the eccentricity of the Earth.
One fundamental step in the development of the understanding of statistics has been the clear establishment of the so-called Gauss-Markov linear standard model, with all its developments in least squares theory; this is understood by explaining what are
• the deterministic model,
• the stochastic model.
The deterministic model in Gauss-Markov theory
(discrete and finite-dimensional):
every experiment can be described by n+m variables organized in two vectors
• measurable quantities y, with dim(y) = m,
• parameters x, with dim(x) = n;
these variables are deterministic, i.e. in principle they can be given a fixed numerical value in the experiment analysed, and they are related by geometric and physical laws.
General mathematical form of the physics of the experiment:

g(x, y) = 0.

From the observations themselves or from prior knowledge we have approximate values x̃, ỹ and we put

x = x̃ + δx,  y = ỹ + δy;

after linearization we have

g(x̃, ỹ) + g_x δx + g_y δy ≅ 0,

and we assume to be able to solve for δy. In the end we have a linear model of observation equations

y = A x + a.
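As a minimal numerical sketch of this linearization (the known point, coordinates and observable below are illustrative, not from the talk), consider a single distance observation with implicit equation g(x, y) = y − |x − p| = 0:

```python
import numpy as np

# Hypothetical example: one distance observation y = ||x - p|| from a known
# point p, linearized around approximate coordinates x_tilde.
p = np.array([0.0, 0.0])          # known station
x_true = np.array([3.0, 4.0])     # "true" coordinates (unknown in practice)
x_tilde = np.array([2.9, 4.1])    # approximate coordinates

y_tilde = np.linalg.norm(x_tilde - p)             # approximate observable
# Gradient of ||x - p|| with respect to x, evaluated at x_tilde:
A = (x_tilde - p) / np.linalg.norm(x_tilde - p)   # row of the design matrix

# Linearized observation equation y ≈ A x + a:
a = y_tilde - A @ x_tilde
y_lin = A @ x_true + a            # linearized prediction
y_exact = np.linalg.norm(x_true - p)
print(abs(y_lin - y_exact))       # small linearization error
```

The linearization error is quadratic in the size of the correction δx, which is why good approximate values x̃ matter.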
Once linearized, the deterministic model has the meaning that y cannot wander over the whole of R^m, but is constrained to the affine manifold

V = { y = A x + a ; x ∈ R^n } = a + Range[A].
The stochastic model (reduced to pure 2nd-order information)

We assume now that, in observing the vector y, we draw a vector Y0 from an m-dimensional variate:

Y0 ~ Y = y + ν   (ν = “errors”),
E{ν} = 0,  E{Y0} = y,
C_{Y0} = C_Y = σ0² Q   (Q known).

L.S. problem: find ŷ on V, somehow related to Y0.
L.S. Principle

Let ŷ be given by

ŷ = Argmin_{y ∈ V} (Y0 − y)^T Q^{−1} (Y0 − y).

Justification (Markov theorem): among all linear unbiased estimators

ỹ = H Y0 + h   (i.e. ỹ ∈ V: ỹ = A x̃ + a, x̃ = K Y0 + k, with K A = I, K a + k = 0),

putting ẽ = ỹ − y and ê = ŷ − y, we have

C_{êê} ≤ C_{ẽẽ}.
By L.S. theory, complemented by suitable numerical techniques, several very large geodetic problems have been solved:
• adjustment of large geodetic networks (N.A. datum: ~40,000 parameters, ~1980)
• satellite orbit control (from 1970)
• analytic photogrammetry
• discrete finite models of the gravity field (e.g. by buried masses or by a truncated spherical harmonics expansion

T(r, σ) = Σ_{n,m} (R/r)^{n+1} T_{nm} Y_{nm}(σ) ).
From L.S. theory new problems have evolved:
• testing theory as applied to:
– correctness of the model (χ² test on σ̂0²),
– values of the parameters (significance of input factors in linear regression analysis),
– outlier identification and rejection (Baarda’s data snooping) and the natural evolution towards robust estimators (L1 estimators etc.).
Mixed models with two types of parameters, x (continuous) and b (integers),

Y = A x + B b + ν,

as with GPS observations, where b are the initial phase ambiguities.

Note the numerical complexity if we adopt a simple trial-and-error strategy for b: if we have a baseline with 10 visible satellites and for each double difference we want to try 3 values, we have to perform 3^9 ≈ 20,000 adjustments.
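The trial-and-error strategy can be sketched on a toy mixed model (matrices, sizes and the candidate set below are illustrative; operational GPS processing uses far more efficient integer searches):

```python
import numpy as np
from itertools import product

# Toy mixed model Y = A x + B b + noise, with b an integer vector.
# We try every candidate b (3 values per component, 3^q combinations)
# and keep the one with the smallest least-squares residual.
rng = np.random.default_rng(1)
m, n, q = 8, 2, 3                     # observations, real params, integer params
A = rng.standard_normal((m, n))
B = rng.standard_normal((m, q))
x_true = np.array([0.3, -1.2])
b_true = np.array([1, -1, 0])         # "true" integer ambiguities
Y = A @ x_true + B @ b_true + 0.001 * rng.standard_normal(m)

best = None
for b in product([-1, 0, 1], repeat=q):       # 3^q candidate vectors
    b = np.array(b)
    x, *_ = np.linalg.lstsq(A, Y - B @ b, rcond=None)
    ssr = np.sum((Y - A @ x - B @ b) ** 2)    # sum of squared residuals
    if best is None or ssr < best[0]:
        best = (ssr, b, x)

ssr, b_hat, x_hat = best
print(b_hat)    # recovered integer vector
```

The cost grows as 3^q, which is exactly the combinatorial explosion the slide points to (3^9 ≈ 20,000 adjustments for one baseline).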
Variance components estimation or random effects model

When we don’t know C but we have a model

C = Σ_{i=1}^{q} σ_i² Q_i   (σ_i² unknown),

this corresponds to the following stochastic model

Y = A x + Σ_{i=1}^{q} A_i ν_i,   A_i A_i^T = Q_i,
E{ν_i} = 0,   E{ν_i ν_j^T} = σ_i² δ_ij I,

where the ν_i are basically non-observable (or hidden) random parameters.
Examples (ITRF 2005)

We estimate the 3N coordinates of Earth stations x1, x2, …, xN by different spatial techniques (e.g. GPS, LR, etc.).

Each technique k has a vector of adjusted coordinates in its own reference frame,

Y_k = [ x_1^(k), …, x_N^(k) ]^T + ν_k,

where, with respect to a unified reference system,

x_i^(k) = T_k x_i   (T_k = frame transformation of technique k)

and C_{ν_k} = σ0k² C^(k).

Note: due to imperfect modelling, one can assume that the estimate of σ0k² is unrealistic.
If we collect all the equations we get

Y = [ Y_1^T, …, Y_t^T ]^T,  with unknowns (x_1, …, x_N, T_1, …, T_t),

C_Y = diag( σ01² C^(1), …, σ0t² C^(t) ),

and in the next ITRF the IERS is going to estimate the variance components σ0k² together with the coordinates x_1, …, x_N.
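A simplified iterative variance component estimation scheme can be sketched as follows (two observation groups of the same parameters, with all numbers illustrative; the redundancy-based update below is one standard choice, not necessarily the ITRF procedure):

```python
import numpy as np

# Model: Y_i = A_i x + nu_i,  C_i = sigma_i^2 Q_i  (here Q_i = I), i = 1, 2.
# Iterate: weighted LS for x, then sigma_i^2 = v_i^T v_i / r_i with r_i the
# local redundancy of group i.
rng = np.random.default_rng(2)
x_true = np.array([1.0, 2.0])
A1 = rng.standard_normal((30, 2))
A2 = rng.standard_normal((40, 2))
s1_true, s2_true = 1.0, 9.0          # true variance components
Y1 = A1 @ x_true + np.sqrt(s1_true) * rng.standard_normal(30)
Y2 = A2 @ x_true + np.sqrt(s2_true) * rng.standard_normal(40)

s1, s2 = 1.0, 1.0                    # initial guesses
for _ in range(20):
    N1, N2 = A1.T @ A1 / s1, A2.T @ A2 / s2
    N = N1 + N2                                   # combined normal matrix
    x = np.linalg.solve(N, A1.T @ Y1 / s1 + A2.T @ Y2 / s2)
    v1, v2 = Y1 - A1 @ x, Y2 - A2 @ x             # group residuals
    r1 = 30 - np.trace(np.linalg.solve(N, N1))    # local redundancies
    r2 = 40 - np.trace(np.linalg.solve(N, N2))
    s1, s2 = v1 @ v1 / r1, v2 @ v2 / r2
print(s1, s2)
```

The iteration reweights each group by its estimated variance, so a technique with optimistic formal errors is automatically down-weighted.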
The Bayesian revolution
• Probability is an axiomatic index measuring the subjective state of knowledge of a certain system,
• every system is thus described by a number of random variables through their joint distribution,
• every observation modifies the state of knowledge of the system, namely the distribution of the relevant variables, through the operation of probabilistic conditioning.
According to this vision, the physical laws are
only verified in the mean, when we average on a
population of effects that cannot be controlled,
and which can be described only by a probability
distribution, expressing our prior knowledge of
the phenomenon.
(De Finetti)
Linear Bayesian Models
We start from observation equations

Y = A X + N,

where all variables are random; X, N are the primitive variables of the observation process and we assume them to be independent and described by some prior distribution

p_{X,N}(x, n) = p_{X0}(x) p_{N0}(n);

Y, the variable we sample by observations, is a derived variable with joint prior

p_{X,Y}(x, y) = p_{Y|X}(y | x) p_{X0}(x) = p_{N0}(y − A x) p_{X0}(x).
According to Bayes theorem the observation Y0 enters to condition the X distribution, namely

p_{X|Y}(x | Y0) = p_{N0}(Y0 − A x) p_{X0}(x) / ∫ p_{N0}(Y0 − A ξ) p_{X0}(ξ) dξ.   (Posterior)

Example (random networks): we measure two distances D1, D2 of a point P from known points P1 and P2.

[Figure: prior distribution of P, with distances D1, D2 from the known points P1, P2]
[Figure, four panels: we measure D1 (effect of D1: the distribution of P concentrates on an annulus around P1); we then measure D2 (effect of D1 and D2: the distribution concentrates at the intersection of the two annuli), giving the posterior of P]
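The conditioning of the prior of P by the two distance measurements can be sketched on a grid (all coordinates and standard deviations below are illustrative):

```python
import numpy as np

# Grid-based Bayes update: a broad Gaussian prior on the position of P is
# multiplied by the likelihood of each measured distance D_i from a known
# point P_i, then normalized.
P1, P2 = np.array([0.0, 0.0]), np.array([10.0, 0.0])
P_true = np.array([5.0, 5.0])
sigma_d = 0.3                               # distance measurement noise
D1 = np.linalg.norm(P_true - P1)
D2 = np.linalg.norm(P_true - P2)

xs = np.linspace(-2, 12, 141)
ys = np.linspace(-2, 12, 141)
X, Y = np.meshgrid(xs, ys)
prior = np.exp(-((X - 5) ** 2 + (Y - 4) ** 2) / (2 * 3.0 ** 2))  # broad prior

def likelihood(Pk, Dk):
    # annulus-shaped likelihood of one distance measurement
    dist = np.sqrt((X - Pk[0]) ** 2 + (Y - Pk[1]) ** 2)
    return np.exp(-((dist - Dk) ** 2) / (2 * sigma_d ** 2))

post = prior * likelihood(P1, D1) * likelihood(P2, D2)
post /= post.sum()                          # Bayes denominator (normalization)
i, j = np.unravel_index(post.argmax(), post.shape)
print(xs[j], ys[i])                         # posterior mode, near P_true
```

After D1 the posterior is an annulus; after D2 it collapses onto the intersection of the two annuli, exactly as in the figure.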
A general Bayesian network is a network where
points are random and measurements change
their distribution; in a sense (apart from
Heisenberg’s principle) there is a striking similarity
with quantum theory.
Let us now restrict the Bayes concept to the linear
regression case.
Restricting to second-order statistics, with

E{X} = 0,   E{X X^T} = C_{X0},

we have

Y = A X + N,   X̂ = L Y = L A X + L N,
ê = X̂ − X = (L A − I) X + L N,
E{ê^T ê} = Tr C_{êê} = Tr[ (L A − I) C_{X0} (L A − I)^T + L C_N L^T ] = Min over L.

The solution is then written as

X̂ = C_{X0} A^T ( A C_{X0} A^T + C_N )^{−1} Y0 = ( A^T C_N^{−1} A + C_{X0}^{−1} )^{−1} A^T C_N^{−1} Y0,

where we see that it is a combination of the observations with the prior knowledge.
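The equivalence of the two forms of the estimator (a consequence of the matrix inversion lemma) can be checked numerically on illustrative matrices:

```python
import numpy as np

# Check the two equivalent forms of the Bayesian linear estimator:
#   X_hat = C_X0 A^T (A C_X0 A^T + C_N)^{-1} Y0
#         = (A^T C_N^{-1} A + C_X0^{-1})^{-1} A^T C_N^{-1} Y0
rng = np.random.default_rng(3)
m, n = 5, 3
A = rng.standard_normal((m, n))
Cx = np.diag([1.0, 2.0, 0.5])         # prior covariance C_X0
Cn = 0.1 * np.eye(m)                  # noise covariance C_N
Y0 = rng.standard_normal(m)

# "Covariance" form (inversion in observation space, size m):
x1 = Cx @ A.T @ np.linalg.solve(A @ Cx @ A.T + Cn, Y0)
# "Information" form (inversion in parameter space, size n):
x2 = np.linalg.solve(A.T @ np.linalg.inv(Cn) @ A + np.linalg.inv(Cx),
                     A.T @ np.linalg.inv(Cn) @ Y0)
print(np.allclose(x1, x2))            # both forms agree
```

The second form makes the role of the prior explicit: as C_{X0}^{−1} → 0 (vague prior), the estimator tends to ordinary weighted least squares.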
Note: now we no longer have a rank deficiency, because C_N > 0 for sure, so that we can have n > m and even

n = +∞,

i.e. X is in reality a random field:

u(P) = Σ_n X_n φ_n(P),
Y_i = L_i{u(P)} = Σ_n X_n L_i{φ_n(P)},

the L_i being linear functionals of u.
Examples
• Cartography: u(P) is the DEM; L[u(P)] = u(P) (point height).
• Image analysis: u(P) is the density of flux through a sensor element; L[u(P)] = ∫_Pixel u(P) dS.
• Physical geodesy: u(P) is the anomalous Earth potential; L[u(P)] = −∂u/∂r − (2/r) u (gravity anomaly).
Important remark: it is easy to prove that C_{X0} controls the prior regularity of the field.

[Figure: sample profiles of elevation, image and gravity fields]

C_{X0} can be considered as a hyperparameter and estimated through an infinite-dimensional calculus (Malliavin calculus). Here statistics is fused with functional analysis to properly define the space of estimators.
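The way the prior covariance controls sample regularity can be illustrated by drawing 1-D Gaussian fields with a smooth and a rough covariance (the kernels and scales are illustrative choices):

```python
import numpy as np

# Draw sample fields u(P) with two prior covariances C_X0 and compare the
# mean squared increment over one grid step: the smoother kernel gives
# visibly smoother samples.
rng = np.random.default_rng(4)
t = np.linspace(0, 10, 200)
D = np.abs(t[:, None] - t[None, :])          # pairwise distance matrix

C_smooth = np.exp(-(D / 1.0) ** 2)           # Gaussian covariance -> smooth samples
C_rough = np.exp(-D / 1.0)                   # exponential covariance -> rough samples

jitter = 1e-6 * np.eye(len(t))               # numerical regularization for Cholesky
u_smooth = np.linalg.cholesky(C_smooth + jitter) @ rng.standard_normal(len(t))
u_rough = np.linalg.cholesky(C_rough + jitter) @ rng.standard_normal(len(t))

msi_smooth = np.mean(np.diff(u_smooth) ** 2)
msi_rough = np.mean(np.diff(u_rough) ** 2)
print(msi_smooth < msi_rough)                # rough prior -> larger increments
```

This is the discrete analogue of the elevation/image/gravity profiles in the figure: changing C_{X0} changes how wiggly the sampled field is.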
Conclusions
If modern statistics, which was born at the beginning together with geodesy and astronomy to treat measurement errors, has slowly drifted away to become the companion first of mechanics, then of radio-signal analysis and finally of the economic sciences, nowadays we are entitled to say that the Earth sciences, with their need of estimating spatial fields, are giving statistics a serious scientific contribution, pushing it along the road of modern probability theory and functional analysis.