
Los Alamos Hydrodynamic Methods, 10/4/01

Local Regression Estimators: A Forgotten Heritage

Gary Dilts, Hydrodynamic Methods Group, Los Alamos National Laboratory

Aamer Haque, George Mason University

John Wallin, George Mason University

LAUR 00-2552


Background

✜ Compressible flow method for coupled solid-fluid systems

Large energy transfer (shocks)

Large deformation

Strong shear

Void opening and closure

Complex material histories

✜ Meshes (Lagrangian or Eulerian) are problematic

✜ SPH appealing candidate, but numerical difficulties

✜ Inspired by EFG, tried to fix with Galerkin + MLS

✜ Much progress, but more difficulties

Artificial viscosity: usual tricks don’t work

Anti-symmetric complete quadrature schemes needed

Can’t seem to do without inverting global matrix

✜ MLS particles must be met on their own terms: mesh ideas don’t seem to pan out. What to do?


Like Balboa, we sighted a new ocean.

1996

Like the Pacific, it was not entirely unknown.


Our Statistical Heritage

✜ Smoothing kernels originally conceived by Rosenblatt (1956) as a means of generalizing the histogram

✜ Lucy’s SPH (1977)

Motivated by Monte Carlo theory

✜ Gingold + Monaghan’s SPH (1977)

SPH originally conceived as probabilistic method

Position of fluid points randomly distributed by mass density

This density is given by the statistical smoothing kernel method

Monte Carlo efficiency suggested SPH would be efficient

✜ Stone (1977) conceived local linear regression as a means to generalize kernel density estimators to achieve “consistent” nonparametric regression estimators

✜ Local linear regression has been extended to the local polynomialregression estimator. Vast literature. 23 years of catch-up.


Recent Books

✜ J Fan, I Gijbels, Local Polynomial Modeling, Chapman & Hall, 1996

✜ JS Simonoff, Smoothing Methods in Statistics, Springer, 1996

✜ TP Ryan, Modern Regression Methods, Wiley, 1997

✜ C Loader, Local Regression and Likelihood, Springer, 1999

✜ Background in theoretical statistics useful.

✜ Much work is very mathematical and rigorous.

✜ We can learn a lot!


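From Histograms to Smoothing Kernels

Distribution and density:

\[ P(X \le x) = F(x), \qquad P(a < X < b) = \int_a^b f(x)\,dx \]

\[ f(x) = \frac{d}{dx}F(x) = \lim_{h \to 0} \frac{F(x+h) - F(x-h)}{2h} \]

Finite Sample: $\{x_i\}_{i=1}^{n}$

Histogram:

Bin endpoints: $\{b_i\}$, $i = 0, 1, \ldots$, with $h = b_i - b_{i-1}$

\[ \hat F_i = \frac{\#\{x_j \le b_i\}}{n}, \qquad \hat f_i = \frac{\#\{x_j \in (b_{i-1}, b_i]\}}{nh} \]

Kernel Estimate:

\[ \hat F(x) = \frac{\#\{x_i \le x\}}{n} \]

\[ \hat f(x) = \frac{\#\{x_i \in (x-h,\, x+h)\}}{2hn} = \frac{1}{nh}\sum_{i=1}^{n} K\!\left(\frac{x - x_i}{h}\right) \]

\[ K(u) = \begin{cases} \tfrac12, & -1 < u \le 1 \\ 0, & \text{otherwise} \end{cases} \qquad K \ge 0, \quad \int K(u)\,du = 1, \quad \int u\,K(u)\,du = 0 \]

Agree: $x = \tfrac12\,(b_i + b_{i+1})$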
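The agreement between histogram and square-kernel estimate at bin midpoints can be checked numerically. The following is a minimal sketch, not taken from the slides: the function names, the synthetic normal sample, and the choice of bins of width 2h (so that each bin coincides with a kernel window of half-width h) are all assumptions made for illustration.

```python
import numpy as np

def square_kernel(u):
    """K(u) = 1/2 for -1 < u <= 1, and 0 otherwise."""
    return np.where((u > -1.0) & (u <= 1.0), 0.5, 0.0)

def gaussian_kernel(u):
    """Standard normal density, a smooth alternative to the square kernel."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def kernel_density(x, samples, h, kernel=square_kernel):
    """f_hat(x) = (1 / (n h)) * sum_i K((x - x_i) / h)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    u = (x[:, None] - samples[None, :]) / h
    return kernel(u).sum(axis=1) / (samples.size * h)

# Synthetic sample (assumption: standard normal data, fixed seed).
rng = np.random.default_rng(0)
samples = rng.normal(size=500)

h = 0.25
edges = np.arange(-4.0, 4.0 + 1e-12, 2 * h)      # bins of width 2h
counts, _ = np.histogram(samples, bins=edges)
hist_density = counts / (samples.size * 2 * h)   # histogram density per bin

midpoints = 0.5 * (edges[:-1] + edges[1:])
kde_at_midpoints = kernel_density(midpoints, samples, h)
smooth_estimate = kernel_density(midpoints, samples, h, kernel=gaussian_kernel)
```

Away from the midpoints the square-kernel estimate slides its window continuously, which is what removes the dependence on bin placement; swapping in `gaussian_kernel` additionally removes the jumps.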


Analysis of CD Rates for Long Island

[Figure: density estimates of the CD rate data in three panels: Histogram, K = Square, K = Gaussian. From Simonoff, chs 2 and 3.]


Parametric Regression Fits to Motorcycle Crash Head Acceleration Data

\[ Y_i = a_0 + a_1 X_i + a_2 X_i^2 + \ldots + \text{error} \]

[Figure: parametric fits to the motorcycle data. From Fan & Gijbels.]


Nonparametric Regression Fits

\[ Y_i = a_0(x) + a_1(x)\,X_i + \text{error}, \qquad X_i \in (x-h,\, x+h) \]

From Fan & Gijbels
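A local linear fit of this kind can be sketched in a few lines. This is an illustrative sketch only: the function name `local_linear`, the unweighted window (every point inside the window counts equally), and the synthetic sine data are assumptions, not the fits shown in Fan & Gijbels.

```python
import numpy as np

def local_linear(x, X, Y, h):
    """Least-squares fit Y_i = a0 + a1 * X_i using only X_i in (x - h, x + h).

    Returns the local coefficients (a0, a1); the estimate at x is a0 + a1 * x.
    """
    near = np.abs(X - x) < h
    design = np.column_stack([np.ones(near.sum()), X[near]])
    (a0, a1), *_ = np.linalg.lstsq(design, Y[near], rcond=None)
    return a0, a1

# Assumed test data: a smooth curve sampled without noise.
X = np.linspace(0.0, 2.0 * np.pi, 200)
Y = np.sin(X)

a0, a1 = local_linear(1.0, X, Y, h=0.3)
estimate = a0 + a1 * 1.0          # local estimate of the curve at x = 1

# A globally linear signal is recovered exactly by every local fit.
b0, b1 = local_linear(1.0, X, 2.0 + 3.0 * X, h=0.3)
```

The coefficients change with the evaluation point x, which is the whole difference from the parametric fit: the model is only asked to hold inside the window.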


Connecting to the Statistical Literature

✜ Statistical viewpoint of the results of a numerical simulation.

✜ Model the output of the code as a fluctuation about some presumed correct value.

✜ We don’t know what the true value is, but will use statistical methods to make the best determination.

✜ Slight departure from tradition:

\[ Y = u(X) + \vartheta(X)\,\varepsilon, \qquad X \in \mathbb{R}^d, \quad Y \in \mathbb{R}^m, \quad u \in \mathbb{R}^m \]

$X$: spatial sample point

$f(x)$: distribution of $X$

$Y$: code output

$u$: true solution (mean)

$\varepsilon$: random error source, with $E(\varepsilon) = 0$, $\mathrm{Var}(\varepsilon) = 1$, and $\varepsilon$ and $X$ independent

$\vartheta$: error scale; $\vartheta^2(x)$ is the conditional variance of $Y$ given $X = x$

Correspondence with particle data: $Y_i \leftrightarrow u_i$, $X_i \leftrightarrow x_i$


Preliminary Notions

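\[ p(x)^T = \left[1,\ x,\ y,\ x^2/2!,\ \ldots\right] \in \mathbb{R}^n \qquad \text{(Taylor monomials)} \]

\[ p_i(x)^T = \left[1,\ x_i - x,\ y_i - y,\ (x_i - x)^2/2!,\ \ldots\right] \in \mathbb{R}^n \]

\[ J(x) = \left[\,p,\ \frac{\partial p}{\partial x},\ \frac{\partial p}{\partial y},\ \frac{\partial^2 p}{\partial x^2},\ \ldots\right] \in \mathbb{R}^n \otimes \mathbb{R}^n \]

e.g., in one dimension with three monomials:

\[ J(x) = \begin{bmatrix} 1 & 0 & 0 \\ x & 1 & 0 \\ \tfrac12 x^2 & x & 1 \end{bmatrix}, \qquad J^{-1}(x) = \begin{bmatrix} 1 & 0 & 0 \\ -x & 1 & 0 \\ \tfrac12 x^2 & -x & 1 \end{bmatrix} \]

\[ p_i(x) = J^{-1}(x)\,p(x_i) \qquad \text{(Reproducing basis)} \]

No “shifting” necessary.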


A Modified Local Polynomial Regression

Estimate of $u(x)$ and derivatives: $\beta(x) \in \mathbb{R}^m \otimes \mathbb{R}^n$

Taylor expansion of $u$ from $x$ to $x_j$: $\beta(x)\,p_j(x)$

Weighted Taylor objective:

\[ L(x) = \sum_j \big(u_j - \beta(x)\,p_j(x)\big)^2\, W_j(x) \]

Condition for optimum:

\[ \frac{\partial L(x)}{\partial \beta(x)} = 0 \]

Solution for estimates:

\[ \beta(x) = \sum_j u_j\,\psi_j^T(x) \]

Vector of shape functions:

\[ \psi_i(x) = B^{-1}(x)\,p_i(x)\,W_i(x) \in \mathbb{R}^n \]

\[ B(x) = \sum_j p_j(x)\,p_j^T(x)\,W_j(x) \in \mathbb{R}^n \otimes \mathbb{R}^n \]
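The estimator defined on this slide can be sketched directly from its formulas for a scalar unknown in one dimension. Everything named here is an assumption made for illustration: the function names, the cubic B-spline as the weight W, the perturbed grid, and the quadratic test field. The basis uses the offsets x_j - x, matching a Taylor expansion from x to x_j.

```python
import numpy as np
from math import factorial

def taylor_basis(dx, n):
    """Rows are p(dx)^T = [1, dx, dx^2/2!, ..., dx^n/n!] for each offset dx."""
    return np.stack([dx**k / factorial(k) for k in range(n + 1)], axis=-1)

def cubic_bspline(q):
    """Unnormalized cubic B-spline weight with support |q| < 2 (assumed W)."""
    q = np.abs(q)
    return np.where(q < 1.0, 1.0 - 1.5 * q**2 + 0.75 * q**3,
                    np.where(q < 2.0, 0.25 * (2.0 - q)**3, 0.0))

def lpr_estimate(x, xj, uj, h, n=2):
    """Return beta(x) = [u, u', ..., u^(n)] estimated at x from samples (xj, uj)."""
    P = taylor_basis(xj - x, n)                    # rows p_j(x)^T
    W = cubic_bspline((xj - x) / h)                # weights W_j(x)
    PW = P * W[:, None]
    B = PW.T @ P                                   # B = sum_j p_j p_j^T W_j
    psi = np.linalg.solve(B, PW.T)                 # columns psi_j = B^-1 p_j W_j
    return psi @ uj                                # beta = sum_j u_j psi_j^T

# Assumed test case: a quadratic field on a slightly perturbed grid.
rng = np.random.default_rng(1)
xj = np.linspace(0.0, 1.0, 21) + rng.uniform(-0.01, 0.01, 21)
uj = 1.0 + 2.0 * xj + 3.0 * xj**2

beta = lpr_estimate(0.3, xj, uj, h=0.1)   # value, first and second derivative
```

For this quadratic field the fit is exact, so `beta` holds the value 1.87, the slope 3.8, and the curvature 6 at x = 0.3: all derivatives come out of the same least-squares solve.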


Properties of LPR Estimators

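Columns of $\beta$ are derivatives:

\[ e_0^T = [1, 0, 0, \ldots], \quad e_1^T = [0, 1, 0, \ldots], \qquad u \approx \hat u \equiv \beta e_0, \quad \partial_x u \approx \widehat{\partial_x u} \equiv \beta e_1, \ \ldots \]

Partition of Identity Matrix:

\[ \sum_j p_j(x)\,\psi_j^T(x) = I \]

Reproducing condition:

\[ u = \lambda\,p(x) \ \Rightarrow\ \beta = \lambda J, \qquad \lambda = e_a^T \ \Rightarrow\ \beta = e_a^T J = \left[\,p_a,\ \partial_x p_a,\ \ldots\right] \]

Equivalence to MLS:

\[ p_j(x) = J^{-1} p(x_j) \ \Rightarrow\ B = J^{-1} A\, J^{-T}, \qquad A = \sum_j p(x_j)\,p(x_j)^T\,W_j(x) \]

\[ \psi_j^T e_0 = p(x_j)^T A^{-1} J e_0\, W_j(x) \equiv \phi_j(x) \]

These properties hold if p is any set of independent functions!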
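Two of these properties, the partition of the identity matrix and the equivalence of the zeroth component of the LPR shape functions with the MLS shape functions, can be verified numerically. The sketch below assumes a one-dimensional quadratic basis, a cubic B-spline weight, and a uniform set of sample points; the names `A`, `phi`, and `partition` are illustrative, with `A` taken to be the usual MLS moment matrix built from unshifted monomials.

```python
import numpy as np

def cubic_bspline(q):
    """Unnormalized cubic B-spline weight with support |q| < 2 (assumed W)."""
    q = np.abs(q)
    return np.where(q < 1.0, 1.0 - 1.5 * q**2 + 0.75 * q**3,
                    np.where(q < 2.0, 0.25 * (2.0 - q)**3, 0.0))

def monomials(x):
    """p(x) = [1, x, x^2/2] for a scalar x."""
    return np.array([1.0, x, 0.5 * x**2])

x, h = 0.37, 0.15
xj = np.linspace(0.0, 1.0, 21)
W = cubic_bspline((xj - x) / h)

# LPR: basis in the offsets x_j - x.
Pj = np.array([monomials(d) for d in xj - x])      # rows p_j(x)^T
B = (Pj * W[:, None]).T @ Pj
psi = np.linalg.solve(B, (Pj * W[:, None]).T)      # columns psi_j

partition = Pj.T @ psi.T                           # sum_j p_j psi_j^T

# MLS: basis in the absolute coordinates x_j.
Pabs = np.array([monomials(v) for v in xj])        # rows p(x_j)^T
A = (Pabs * W[:, None]).T @ Pabs                   # moment matrix
phi = monomials(x) @ np.linalg.solve(A, (Pabs * W[:, None]).T)
```

Only row 0 of `psi` has an MLS counterpart; rows 1 and 2 are the LPR derivative shape functions, which are fitted directly rather than obtained by differentiating `phi`.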


Advantages for Derivatives

✜ LPR is different from MLS in how derivatives are estimated

MLS: real derivative of a function estimate

LPR: least squares fit for all derivatives simultaneously

✜ MLS derivatives tend to be noisy

✜ LPR derivatives tend to be smoother


LPR Diffusion Method (Haque)

✜ Linear diffusion equation, collocation solution

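✜ Initialize numerical problem with analytic solution at t = .01 and $\kappa = 1$.

\[ \frac{\partial T}{\partial t} = \kappa\,\frac{\partial^2 T}{\partial x^2}, \qquad T(x, 0) = Q\,\delta(x) \]

\[ T_i^{k+1} = T_i^k + \kappa\,\Delta t \left( \sum_j T_j^k\, \psi_{j2}(x_i) \right) \]

Here $\psi_{j2}$ denotes the second-derivative component of the shape function $\psi_j$.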
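A collocation step of this kind can be sketched as follows. This is not Haque's code: the function names, the forward-Euler update, the cubic B-spline weight, the smoothing length h = 2 dx, and the way the grid is perturbed are all assumptions chosen so that the example is small and checkable.

```python
import numpy as np

def cubic_bspline(q):
    """Unnormalized cubic B-spline weight with support |q| < 2 (assumed W)."""
    q = np.abs(q)
    return np.where(q < 1.0, 1.0 - 1.5 * q**2 + 0.75 * q**3,
                    np.where(q < 2.0, 0.25 * (2.0 - q)**3, 0.0))

def second_derivative(x, T, h):
    """Quadratic LPR estimate of T'' at every particle position x_i."""
    d2 = np.empty_like(T)
    for i, xi in enumerate(x):
        dx = x - xi
        P = np.column_stack([np.ones_like(dx), dx, 0.5 * dx**2])
        W = cubic_bspline(dx / h)
        B = (P * W[:, None]).T @ P
        psi = np.linalg.solve(B, (P * W[:, None]).T)
        d2[i] = psi[2] @ T          # second-derivative row of the shape functions
    return d2

def diffusion_step(x, T, kappa, dt, h):
    """One explicit step T^{k+1} = T^k + kappa * dt * (LPR estimate of T'')."""
    return T + kappa * dt * second_derivative(x, T, h)

# Assumed setup: 41 particles on [-1, 1], randomly displaced from a uniform grid.
rng = np.random.default_rng(2)
n_pts = 41
dx0 = 2.0 / (n_pts - 1)
x = np.linspace(-1.0, 1.0, n_pts) + 0.4 * dx0 * rng.uniform(-0.5, 0.5, n_pts)

kappa, dt, h = 1.0, 0.1 * dx0**2, 2.0 * dx0
T0 = x**2                            # T'' = 2 everywhere
curvature = second_derivative(x, T0, h)
T1 = diffusion_step(x, T0, kappa, dt, h)
```

Because a quadratic basis reproduces quadratics exactly, the curvature of `T0` is recovered as 2 at every particle, including those next to the ends where the neighborhood is one-sided.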


Numerical Solutions

(Haque)

✜ 40% randomly perturbed grid

✜ MLS is less stable than LPR

✜ LPR compares well with CFD on much finer uniform grid.


LPR and the Galerkin Approach to Conservation Laws

✜ Regrettably, LPR is not magic.

✜ Use of LPR directly in place of MLS in the Galerkin approach produces hardly a noticeable difference in shock solutions.

✜ Some care must be taken to do the Galerkin approximation correctly, as there is no explicit derivative in LPR.

✜ End result is almost identical to MLS in 1D.

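Type I:

\[ \int \tau\,\rho\,\dot U = \int \tau\,\nabla\cdot F, \qquad \tau = \sum_i \tau_i\,\psi_i^0, \qquad \nabla\cdot F = \sum_j F_j \cdot \vec\psi_j \]

\[ \sum_i \tau_i\, m_i\, \dot U_i = \sum_{ij} \tau_i\, F_j \cdot \int \psi_i^0\,\vec\psi_j \]

\[ m_i\, \dot U_i = \sum_j F_j \cdot \int \psi_i^0\,\vec\psi_j \]

Type II:

\[ \int \tau\,\rho\,\dot U = -\int \nabla\tau \cdot F, \qquad \nabla\tau = \sum_i \tau_i\,\vec\psi_i, \qquad F = \sum_j F_j\,\psi_j^0 \]

\[ \sum_i \tau_i\, m_i\, \dot U_i = -\sum_{ij} \tau_i\, F_j \cdot \int \psi_j^0\,\vec\psi_i \]

\[ m_i\, \dot U_i = -\sum_j F_j \cdot \int \psi_j^0\,\vec\psi_i \]

Correspondence with MLS: $\phi_i \leftrightarrow \psi_i^0$, $\nabla\phi_i \leftrightarrow \vec\psi_i$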


Asymptotic estimates for LPR

✜ MSE is the figure of merit for an estimator:

\[ u(x) = E(Y \mid X = x), \qquad \vartheta^2(x) = \mathrm{Var}(Y \mid X = x) \]

\[ \mathrm{Bias}\{\hat u_\nu(x)\} \equiv E\big(\hat u_\nu(x)\big) - u^{(\nu)}(x) \]

\[ \mathrm{Var}\{\hat u_\nu(x)\} \equiv E\Big(\big[\hat u_\nu(x) - E\big(\hat u_\nu(x)\big)\big]^2\Big) \]

\[ \mathrm{MSE}(x) = E\Big(\big[\hat u_\nu(x) - u^{(\nu)}(x)\big]^2\Big) = \mathrm{Bias}^2\{\hat u_\nu(x)\} + \mathrm{Var}\{\hat u_\nu(x)\} \]

Here $\nu$ is the order of the derivative being estimated and $n$ is the order of the local polynomial.

✜ For LPR these have the asymptotic values, as $h \to 0$ and $Nh \to \infty$:

\[ \mathrm{Var}\{\hat u_\nu(x)\} \to C_0(K, n, \nu)\, \frac{\vartheta^2(x)}{f(x)\, N\, h^{1+2\nu}} \]

\[ \mathrm{Bias}\{\hat u_\nu(x)\} \to C_1(K, n, \nu)\, u^{(n+1)}(x)\, h^{\,n+1-\nu}, \qquad n - \nu \ \text{odd} \]

\[ \mathrm{Bias}\{\hat u_\nu(x)\} \to C_2(K, n, \nu) \left[ u^{(n+2)}(x) + (n+2)\, u^{(n+1)}(x)\, \frac{f'(x)}{f(x)} \right] h^{\,n+2-\nu}, \qquad n - \nu \ \text{even} \]


Asymptotics and Smoothing Length

✜ Large bias (error) at large h: $\mathrm{Bias} \propto h^{\,n+1-\nu}$

✜ Large variance (oscillation) at small h: $\mathrm{Var} \propto 1/h^{1+2\nu}$

✜ Constant trade-off between error and oscillation versus h.

✜ An optimal choice of smoothing length is possible by minimizing the MSE:

\[ h_{opt}(x) = C_4(K, n, \nu) \left[ \frac{\vartheta^2(x)}{\{u^{(n+1)}(x)\}^2\, f(x)} \right]^{\frac{1}{2n+3}} N^{-\frac{1}{2n+3}} \]

✜ Contains unknown quantities. Many techniques for estimating these. Art form.

✜ For first derivatives, $f = 1/L$, $N = L/\Delta x$, $h_{opt} \propto \Delta x^{1/5}$.

✜ Best to use $n - \nu$ odd. Same variance and bias as next-lowest even but more parameters to fit with.
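The exponent 1/(2n+3) in the optimal smoothing length follows from balancing the two asymptotic terms. The short derivation below is a sketch for the case of $n - \nu$ odd; the abbreviations $a$ and $b$ are introduced here only for illustration and stand for the $h$-independent factors of the squared bias and of the variance.

```latex
\[
\mathrm{MSE}(h) \approx a\,h^{2(n+1-\nu)} + \frac{b}{N\,h^{1+2\nu}},
\qquad
a = C_1^2\,\{u^{(n+1)}(x)\}^2,
\quad
b = \frac{C_0\,\vartheta^2(x)}{f(x)}
\]
\[
\frac{d\,\mathrm{MSE}}{dh} = 0
\;\Rightarrow\;
2(n+1-\nu)\,a\,h^{2n+1-2\nu} = \frac{(1+2\nu)\,b}{N}\,h^{-2-2\nu}
\;\Rightarrow\;
h_{opt}^{\,2n+3} = \frac{(1+2\nu)\,b}{2(n+1-\nu)\,a\,N}
\]
\[
h_{opt} \propto \left[\frac{\vartheta^2(x)}{\{u^{(n+1)}(x)\}^2\,f(x)}\right]^{\frac{1}{2n+3}} N^{-\frac{1}{2n+3}}
\]
```

With a uniform density $f = 1/L$ and $N = L/\Delta x$, the case $n = 1$ gives $h_{opt} \propto N^{-1/5} \propto \Delta x^{1/5}$, so the optimal smoothing length shrinks much more slowly than the particle spacing.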


Effectiveness of Proper Smoothing Length

Finding the Signal in the Noise with a Certain Bandwidth Selector

From Fan & Gijbels


Help from Statistics

✜ Statisticians have well-developed notions that may be of use in the analysis of experimental and computational data for mechanics

✜ Decreasing sensitivity to outliers with LOWESS

✜ Maximum likelihood

✜ Confidence intervals

✜ Optimality: the best smoother is LPR

✜ Asymptotic equivalent kernel


Reducing Influence of Outliers with LOWESS

✜ Weights can be manipulated to reduce the effect of outliers.

\[ u \approx \hat u \equiv \hat\beta\, e_0, \qquad \hat\beta(x) = \sum_j u_j\,\psi_j^T(x) \]

\[ \psi_i(x) = B^{-1}(x)\,p_i(x)\,W_i(x), \qquad B(x) = \sum_j p_j(x)\,p_j^T(x)\,W_j(x) \]

Iteratively de-weight the particles with large residuals:

\[ W_i(x) \to W_i(x)\, L\!\left(\frac{r_i^k}{6 R^k}\right) \]

\[ r_i^k = \hat u^k(x_i) - u_i, \qquad L(x) = \big(1 - x^2\big)_+^2, \qquad R^k = \mathrm{median}\,\{|r_i^k|\}_{i=1}^{N} \]

From Fan + Gijbels
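The de-weighting iteration can be sketched for a local linear fit as follows. The sketch assumes that the robustness factor is recomputed from the current residuals at every pass and multiplies the original kernel weight, and that the median is taken over absolute residuals; the function names, the cubic B-spline weight, and the synthetic data with one planted outlier are illustrative assumptions.

```python
import numpy as np

def cubic_bspline(q):
    """Unnormalized cubic B-spline weight with support |q| < 2 (assumed W)."""
    q = np.abs(q)
    return np.where(q < 1.0, 1.0 - 1.5 * q**2 + 0.75 * q**3,
                    np.where(q < 2.0, 0.25 * (2.0 - q)**3, 0.0))

def fit_all(x, u, h, robustness):
    """Local linear estimate u_hat(x_i) at every particle, weights W_j * robustness_j."""
    fitted = np.empty_like(u)
    for i, xi in enumerate(x):
        w = cubic_bspline((x - xi) / h) * robustness
        P = np.column_stack([np.ones_like(x), x - xi])
        B = (P * w[:, None]).T @ P
        beta = np.linalg.solve(B, (P * w[:, None]).T @ u)
        fitted[i] = beta[0]
    return fitted

def lowess(x, u, h, iterations=2):
    """Iteratively de-weight particles with large residuals, then refit."""
    robustness = np.ones_like(u)
    for _ in range(iterations):
        r = fit_all(x, u, h, robustness) - u          # residuals r_i
        R = np.median(np.abs(r))                      # residual scale
        robustness = np.clip(1.0 - (r / (6.0 * R))**2, 0.0, None)**2
    return fit_all(x, u, h, robustness)

# Assumed data: noisy straight line with a single large outlier at x = 0.5.
rng = np.random.default_rng(3)
x = np.linspace(0.0, 1.0, 101)
u = 1.0 + 2.0 * x + 0.05 * rng.normal(size=x.size)
u[50] += 5.0

plain = fit_all(x, u, 0.05, np.ones_like(u))[50]
robust = lowess(x, u, 0.05)[50]
truth = 1.0 + 2.0 * x[50]
```

In the first pass the neighbors of the outlier are also de-weighted, because their own fits are contaminated; once the outlier itself carries zero weight, their residuals shrink and their weights are restored on the next pass.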


Confidence Intervals

✜ It is possible to construct confidence intervals for LPR estimators

✜ After folding through time integrator, is it possible to set confidence limits on a hydro calculation?

\[ \hat u_\nu(x) - \hat b_{n,\nu}(x) \pm z_{1-\alpha/2}\, \hat V_{n,\nu}(x)^{1/2} \]

$\hat b_{n,\nu}(x)$: Estimate for bias

$\hat V_{n,\nu}(x)$: Estimate for variance

$z_{1-\alpha/2}$: Gaussian quantile

From Fan + Gijbels


The Best Kernel and Smoother

✜ The optimal kernel for LPR is the Epanechnikov:

\[ K(x) = c \cdot \big(1 - x^2\big)_+ \]

✜ Theorem: With this kernel, LPR is THE optimal linear smoother.

✜ Didn’t work too well for hydro

✜ Not smooth at neighbor loss?

✜ Continue to use cubic B-spline
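The two kernels mentioned on this slide can be written down and compared in a few lines. This is a sketch under stated assumptions: the constant c = 3/4 is the value that normalizes the Epanechnikov kernel to unit integral in one dimension, and the cubic B-spline is given in the common SPH form with support |q| < 2 and one-dimensional normalization 2/3; the function names are illustrative.

```python
import numpy as np

def epanechnikov(x):
    """K(x) = (3/4) * (1 - x^2)_+ ; the slope is discontinuous at |x| = 1."""
    return 0.75 * np.clip(1.0 - x**2, 0.0, None)

def cubic_bspline(q):
    """1D cubic B-spline kernel, support |q| < 2, continuous up to the second derivative."""
    q = np.abs(q)
    return (2.0 / 3.0) * np.where(q < 1.0, 1.0 - 1.5 * q**2 + 0.75 * q**3,
                                  np.where(q < 2.0, 0.25 * (2.0 - q)**3, 0.0))

# Riemann-sum check that both kernels integrate to one.
q = np.linspace(-3.0, 3.0, 60001)
dq = q[1] - q[0]
area_epanechnikov = np.sum(epanechnikov(q)) * dq
area_bspline = np.sum(cubic_bspline(q)) * dq

# One-sided slopes just inside the edge of each support.
eps = 1e-6
edge_slope_epanechnikov = (epanechnikov(1.0) - epanechnikov(1.0 - eps)) / eps
edge_slope_bspline = (cubic_bspline(2.0) - cubic_bspline(2.0 - eps)) / eps
```

The Epanechnikov kernel leaves its support with a finite slope, while the B-spline goes to zero smoothly; this is one plausible reading of the "not smooth at neighbor loss" remark, since a particle crossing the edge of the support changes the estimate abruptly in the first case.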


Equivalent Kernel

✜ In continuum limit, one can show that the LPR estimator converges to an integral kernel approximation:

\[ \int_{-1}^{1} y^m\, K^*_{n,\nu}(y)\, dy = \delta_{m,\nu}, \qquad m < n \]

\[ \int_{x_0 - h}^{x_0 + h} f(x)\, K^*_{n,\nu}\!\left(\frac{x - x_0}{h}\right) dx = f^{(\nu)}(x_0) + O(h^n) \]

(normalizing powers of $h$ taken as absorbed into $K^*_{n,\nu}$)

✜ Order of error is at least n for all derivatives

✜ Of great theoretical utility.

From Fan + Gijbels


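LRE for Discontinuities (cf. Loader)

[Equations and figure not recoverable from the extracted text. Legible fragments: an error measure $\varepsilon$ attached to each local fit; center, left, and right estimates $\hat u_c$, $\hat u_l$, $\hat u_r$; a piecewise rule that selects among them by comparing their $\varepsilon$ values; a test function built from an exponential and a sine with argument involving $2\pi$; panel labels center, left, right.]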


Differential Equations and LRE

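ODE

\[ p = \left[1,\ t,\ \tfrac12 t^2\right]^T \]

\[ \begin{cases} \dot u = F(t)\,u \\ \ddot u = \big(\dot F(t) + F^2(t)\big)\,u \end{cases} \quad\Rightarrow\quad \begin{cases} \beta e_1 - F(t)\,\beta e_0 = 0 \\ \beta e_2 - \big(\dot F(t) + F^2(t)\big)\,\beta e_0 = 0 \end{cases} \]

PDE

\[ p = \left[1,\ t,\ x,\ \tfrac12 t^2,\ tx,\ \tfrac12 x^2\right]^T \]

\[ \begin{cases} \partial_t u + \partial_x F(u) = G(u) \\ \partial_t^2 u + \partial_t \partial_x F(u) = \partial_t G(u) \\ \partial_x \partial_t u + \partial_x^2 F(u) = \partial_x G(u) \end{cases} \]

With the columns of $\beta$ indexed in the order of $p$ ($e_0, \ldots, e_5$), the chain rule turns these into

\[ \begin{cases} \beta e_1 + D_uF(\beta e_0)\,\beta e_2 = G(\beta e_0) \\ \beta e_3 + D_uF(\beta e_0)\,\beta e_4 + D_u^2F(\beta e_0)\,\beta e_1\,\beta e_2 = D_uG(\beta e_0)\,\beta e_1 \\ \beta e_4 + D_uF(\beta e_0)\,\beta e_5 + D_u^2F(\beta e_0)\,\beta e_2\,\beta e_2 = D_uG(\beta e_0)\,\beta e_2 \end{cases} \]

Group Symmetries

\[ g \in G: \quad \mathrm{pr}\,g\,(u,\ D_x u,\ D_x^2 u,\ \ldots) = 0 \quad\Rightarrow\quad \mathrm{pr}\,g\,(\beta e_0,\ \beta e_1,\ \beta e_2,\ \ldots) = 0 \]

Leibniz’ Rule

\[ u_3 = u_1 u_2, \qquad \partial_x u_3 = u_2\,\partial_x u_1 + u_1\,\partial_x u_2 \]

\[ e_3^T \beta e_1 = \big(e_2^T \beta e_0\big)\big(e_1^T \beta e_1\big) + \big(e_1^T \beta e_0\big)\big(e_2^T \beta e_1\big) \]

Thermodynamics

\[ p = [1,\ x,\ \ldots]^T \]

\[ T\,ds = d\eta + p\,dv, \qquad T\,\partial_x s = \partial_x \eta + p\,\partial_x v \]

\[ \big(e_T^T \beta e_0\big)\big(e_s^T \beta e_1\big) = e_\eta^T \beta e_1 + \big(e_p^T \beta e_0\big)\big(e_v^T \beta e_1\big) \]

Differential relations are algebraic constraints on LRE solution: $D(\beta) = 0$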
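For the linear ODE case the constraints fix every column of β in terms of βe_0, so the local fit collapses to a single unknown. The sketch below illustrates this for a scalar unknown; the function names, the cubic B-spline weight, and the test problem with F = -1 are assumptions made for illustration rather than the presenters' implementation.

```python
import numpy as np

def cubic_bspline(q):
    """Unnormalized cubic B-spline weight with support |q| < 2 (assumed W)."""
    q = np.abs(q)
    return np.where(q < 1.0, 1.0 - 1.5 * q**2 + 0.75 * q**3,
                    np.where(q < 2.0, 0.25 * (2.0 - q)**3, 0.0))

def tuned_ode_fit(t, tj, uj, h, F, Fdot):
    """Quadratic local fit at t, constrained by u' = F u and u'' = (F' + F^2) u.

    The constraints give beta = beta_0 * [1, F, F' + F^2], so only beta_0
    remains to be fitted by weighted least squares.
    """
    ratios = np.array([1.0, F(t), Fdot(t) + F(t)**2])
    d = tj - t
    g = ratios[0] + ratios[1] * d + 0.5 * ratios[2] * d**2   # beta p_j / beta_0
    W = cubic_bspline(d / h)
    beta0 = np.sum(W * uj * g) / np.sum(W * g * g)           # scalar "matrix inverse"
    return beta0 * ratios

# Assumed test problem: u' = -u with exact samples of exp(-t).
tj = np.linspace(0.0, 1.0, 41)
uj = np.exp(-tj)
beta = tuned_ode_fit(0.5, tj, uj, h=0.05,
                     F=lambda t: -1.0, Fdot=lambda t: 0.0)
```

The returned derivative estimates satisfy the differential relation exactly by construction, which is the sense in which the residual of the relations is zero even though the fit to the data is only approximate.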


Parameterized Differential Relations

✜ The differential relations we are interested in can often have unknown parameters associated with them

Eigenvalues

Material constants

Unknown design specs

✜ In general we write

\[ D(\alpha;\ \beta) = 0 \]


Point Constraints and LRE

✜ In general, $\beta$ depends on the $u_i$.

✜ Floating interior data are determined by minimization.

✜ Global implicit system results whose sources are boundary specs.

✜ The “stiffness matrix” is the coefficient of the $u_i$ in the expression for $\beta$.

✜ A point on boundary can be used to specify boundary conditions.

✜ Invert the MLS matrix $\phi_i(x_j)$

\[ L(x) = \sum_j \big(u_j - \beta(x)\,p_j(x)\big)^2\, W_j(x) \]

\[ x_i \notin B: \quad p_i(x_i)^T = [1, 0, \ldots], \quad W_i(x_i) = W_0, \qquad \frac{\partial L(x_i)}{\partial u_i} = 0 \ \Rightarrow\ u_i = \hat\beta(x_i)\,e_0 = \hat u(x_i) \]

\[ x_i \in B: \quad H_i\big(\beta(x_i)\big) = h_i \]

Labels on the slide: Implicit, Explicit


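Tuned Regression Estimators: General Prescription

$x \in \mathbb{R}^d$: space + time dimensions

$u: \mathbb{R}^d \to \mathbb{R}^m$: unknowns

$\alpha \in \mathbb{R}^q$: parameters

$F: \mathbb{R}^{mn} \otimes \mathbb{R}^q \to \mathbb{R}^r$: $n$ derivatives, $r$ equations

$G: \mathbb{R}^d \otimes \mathbb{R}^q \to \mathbb{R}^r$: sources

$D[\alpha;\ u(x),\ D_x u(x),\ \ldots] = 0$: differential relations

\[ L(x) = \sum_j \big(u_j - \beta(x)\,p_j(x)\big)^2\, W_j(x) \qquad \text{Weighted error of local series expansion} \]

\[ D[\alpha;\ \beta e_0,\ \beta e_1,\ \ldots] = 0 \qquad \text{Match the differential relations (tuning)} \]

\[ \frac{\partial L}{\partial \alpha} = 0 \qquad \text{Best fit for parameters} \]

\[ \frac{\partial L}{\partial \beta} = 0 \qquad \text{Best fit for derivatives (explicitness)} \]

\[ \frac{\partial L(x_i)}{\partial u_i} = 0, \quad i \notin B \qquad \text{Best fit for unknown data (implicitness)} \]

\[ H_i\big(\beta(x_i)\big) = h_i, \quad i \in B \qquad \text{Match point constraints} \]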
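The "best fit for parameters" step can be illustrated with the smallest possible inverse problem: recovering the constant α in u' = αu from samples of u. The sketch below is an assumption-laden illustration, not the presenters' method: it profiles out βe_0 in closed form, replaces the stationarity condition on α with a plain grid search, and uses exact samples of exp(-2t); all names are hypothetical.

```python
import numpy as np

def cubic_bspline(q):
    """Unnormalized cubic B-spline weight with support |q| < 2 (assumed W)."""
    q = np.abs(q)
    return np.where(q < 1.0, 1.0 - 1.5 * q**2 + 0.75 * q**3,
                    np.where(q < 2.0, 0.25 * (2.0 - q)**3, 0.0))

def tuned_misfit(alpha, t, tj, uj, h):
    """Weighted error L at t for a quadratic fit tuned to u' = alpha * u.

    Tuning gives beta = beta_0 * [1, alpha, alpha^2]; beta_0 is solved exactly
    for each trial alpha. Returns (L, beta_0).
    """
    d = tj - t
    g = 1.0 + alpha * d + 0.5 * alpha**2 * d**2
    W = cubic_bspline(d / h)
    beta0 = np.sum(W * uj * g) / np.sum(W * g * g)
    return np.sum(W * (uj - beta0 * g)**2), beta0

def fit_parameter(t, tj, uj, h, alphas):
    """Pick the trial alpha with the smallest weighted error (grid search)."""
    misfits = [tuned_misfit(a, t, tj, uj, h)[0] for a in alphas]
    return alphas[int(np.argmin(misfits))]

# Assumed data: exact samples of a solution of u' = -2 u.
tj = np.linspace(0.0, 1.0, 101)
uj = np.exp(-2.0 * tj)
alphas = np.linspace(-4.0, 0.0, 4001)
alpha_hat = fit_parameter(0.5, tj, uj, h=0.1, alphas=alphas)
```

The recovered value is close to, but not exactly, -2: the quadratic local model truncates the exponential, so the estimate carries a small bias that shrinks with h. A Newton or Gauss-Newton iteration on α would be the natural replacement for the grid search.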


Features of Tuned Regression

✜ Unified formalism for two modes of use:

Forward modeling: Simulation + Prediction (physicists and engineers)

Inverse modeling: Analysis + Fitting (statisticians)

Allows blending both seamlessly

E.g. back out material characterizations using finite wave sample

Allows for noise, randomness, fuzziness in data

✜ Nonlinear tuning causes loss of factorization $\beta = \sum_j u_j\,\psi_j^T$.

✜ Nonlinear tuning may preclude explicit solution of constraints.

Iterative procedure likely required.

✜ Residual of differential relations is always zero.

Price: estimate of derivative $\ne$ derivative of estimate

✜ Tuning causes elimination of components of $\beta$, giving smaller matrices to invert, giving better answer faster.


Forward Tuned Regression Example

✜ 201 points

✜ 40% random x

✜ h/dx = 1.0.

✜ $\dot u = -4u + e^{-4x}\left( \dfrac{20\cos[20x]}{x} - \dfrac{\sin[20x]}{x^2} \right)$

✜ $\beta(\pm 1)\,e_0 = u(\pm 1)$

✜ Used tuning

✜ Inverted matrix to solve $\partial L(x_i)/\partial u_i = 0$


Inverse Tuned Regression Example

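\[ \dot u = A\,u \]

where $A$ is a 2×2 matrix with entries of magnitude .25 and $2\pi$ (the signs are not recoverable from the extracted text).

[Figure: fits with 6 points and with 11 points, comparing Quadratic LPR against Tuned Quadratic LPR using ODE + ODE’.]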

Inverse Tuned Regression Convergence

✜ 6 to 81 points

✜ 30% random points and h’s sampled from 0 to 10

✜ Expect cubic convergence from each, as both are quadratic

✜ Typical methods lose an order of convergence on non-uniform grids.

✜ LPR is no exception

✜ Tuning regains the lost order of accuracy.

✜ Tuning reduces 6 unknowns to 2.

✜ Invert 2x2 matrix instead of 3x3.

✜ Better answer at less cost.

✜ Can use to replace semi-discretization with a space-time particle method for hydro?

[Figure: convergence plot with fitted lines. Tuned LPR: -3.12521 - 3.00877 Log h. LPR: -3.28456 - 2.12287 Log h.]


Summary

✜ Much MLS experience: successful mesh-based ideas may lack some essential feature when transferred to meshless world.

✜ Statistical LPR brings meshless full-circle to its origins.

✜ Statistical literature is full of rigorous mathematical results which could inspire rigorous developments in meshless mechanics.

✜ Statistical literature inspires many intriguing naturally meshless analogues to mesh-based ideas.

✜ 23 years of literature to catch up on. Stone: 277 citations

✜ Lots of development yet for mechanics applications.

✜ Take small steps.

ODEs, advection, Poisson, Helmholtz, Euler

Deterministic + stochastic

Convergence, stability, etc.

✜ It’s not: Galerkin, collocation, differences. It’s regression.

✜ Work enough for an army of students and post-docs.
