Chapter 14: General Linear Least Squares and Nonlinear Regression
TRANSCRIPT
Linear least-squares fit:
y = -20.5717 + 3.6005x
Error Sr = 4201.3
Correlation r = 0.4434

x = [-2.5 3.0 1.7 -4.9 0.6 -0.5 4.0 -2.2 -4.3 -0.2];
y = [-20.1 -21.8 -6.0 -65.4 0.2 0.6 -41.3 -15.4 -56.1 0.5];

Large error, poor correlation: preferable to fit a parabola.
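For reference, the straight-line coefficients can be reproduced with the backslash operator (Linear_LS itself is not listed in this transcript; this is just a sketch):

% Sketch: reproduce the linear least-squares fit with backslash
x  = [-2.5 3.0 1.7 -4.9 0.6 -0.5 4.0 -2.2 -4.3 -0.2];
y  = [-20.1 -21.8 -6.0 -65.4 0.2 0.6 -41.3 -15.4 -56.1 0.5];
Z  = [ones(10,1) x(:)];   % columns: 1, x
z1 = Z\y(:)               % z1 = [a0; a1] = [-20.5717; 3.6005]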
Polynomial Regression: Quadratic Least Squares

y = f(x) = a0 + a1 x + a2 x^2

Minimize the total square error:

Sr(a0, a1, a2) = Σ (y_i - a0 - a1 x_i - a2 x_i^2)^2,  summed over i = 1..n

Setting the partial derivatives to zero:

∂Sr/∂a0 = -2 Σ (y_i - a0 - a1 x_i - a2 x_i^2) = 0
∂Sr/∂a1 = -2 Σ x_i (y_i - a0 - a1 x_i - a2 x_i^2) = 0
∂Sr/∂a2 = -2 Σ x_i^2 (y_i - a0 - a1 x_i - a2 x_i^2) = 0
Quadratic Least Squares
The normal equations below form a symmetric system. Solve it with Cholesky decomposition, or use the MATLAB backslash operation z = A\r (a sketch appears after the standard-error formula below):
[ n        Σx_i      Σx_i^2 ] [a0]   [ Σy_i       ]
[ Σx_i     Σx_i^2    Σx_i^3 ] [a1] = [ Σx_i y_i   ]
[ Σx_i^2   Σx_i^3    Σx_i^4 ] [a2]   [ Σx_i^2 y_i ]

(all sums taken over i = 1..n)
Standard error for 2nd-order polynomial regression:

s_y/x = sqrt( Sr / (n - 3) )

where
• n observations
• 2nd-order polynomial (3 coefficients)
(start off with n degrees of freedom, use up m+1 for an mth-order polynomial)
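The Quadratic_LS routine called below is not listed in this transcript; a minimal sketch consistent with its printed summary (the course's version also prints the residual table) might be:

function z = Quadratic_LS(x,y)
% Sketch: fit y = a0 + a1*x + a2*x^2 from the 3x3 normal equations
n = length(x);
A = [n           sum(x)      sum(x.^2);
     sum(x)      sum(x.^2)   sum(x.^3);
     sum(x.^2)   sum(x.^3)   sum(x.^4)];
b = [sum(y); sum(x.*y); sum(x.^2.*y)];
z = A\b;                                 % z = [a0; a1; a2]
yfit = z(1) + z(2)*x + z(3)*x.^2;        % fitted values
Sr   = sum((y - yfit).^2);               % residual sum of squares
Syx  = sqrt(Sr/(n-3));                   % standard error of the estimate
r    = sqrt(1 - Sr/sum((y - mean(y)).^2));   % correlation coefficient
fprintf('err = %.4f  Syx = %.4f  r = %.4f\n', Sr, Syx, r);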
» [x,y]=example2;
» z=Quadratic_LS(x,y)

      x          y      (a0+a1*x+a2*x^2)   (y-a0-a1*x-a2*x^2)
  -2.5000   -20.1000       -18.5529            -1.5471
   3.0000   -21.8000       -22.0814             0.2814
   1.7000    -6.0000        -6.3791             0.3791
  -4.9000   -65.4000       -68.6439             3.2439
   0.6000     0.2000        -0.2816             0.4816
  -0.5000     0.6000        -0.7740             1.3740
   4.0000   -41.3000       -40.4233            -0.8767
  -2.2000   -15.4000       -14.4973            -0.9027
  -4.3000   -56.1000       -53.1802            -2.9198
  -0.2000     0.5000         0.0138             0.4862

err = 25.6043
Syx = 1.9125
r = 0.9975
z = 0.2668   0.7200   -2.7231

y = 0.2668 + 0.7200x - 2.7231x^2
(Syx = 1.9125 is the standard error of the estimate; r = 0.9975 is the correlation coefficient.)
function [x,y] = example2
x = [ -2.5 3.0 1.7 -4.9 0.6 -0.5 4.0 -2.2 -4.3 -0.2];
y = [-20.1 -21.8 -6.0 -65.4 0.2 0.6 -41.3 -15.4 -56.1 0.5];
Quadratic least squares fit:
y = 0.2668 + 0.7200x - 2.7231x^2
Error Sr = 25.6043
Correlation r = 0.9975
Cubic Least Squares
f(x) = a0 + a1 x + a2 x^2 + a3 x^3

Sr = Σ (y_i - a0 - a1 x_i - a2 x_i^2 - a3 x_i^3)^2,  summed over i = 1..n

Normal equations:

[ n        Σx_i      Σx_i^2    Σx_i^3 ] [a0]   [ Σy_i       ]
[ Σx_i     Σx_i^2    Σx_i^3    Σx_i^4 ] [a1] = [ Σx_i y_i   ]
[ Σx_i^2   Σx_i^3    Σx_i^4    Σx_i^5 ] [a2]   [ Σx_i^2 y_i ]
[ Σx_i^3   Σx_i^4    Σx_i^5    Σx_i^6 ] [a3]   [ Σx_i^3 y_i ]
» [x,y]=example2;
» z=Cubic_LS(x,y)
x y p(x)=a0+a1*x+a2*x^2+a3*x^3 y-p(x)
-2.5000 -20.1000 -19.9347 -0.1653
3.0000 -21.8000 -21.4751 -0.3249
1.7000 -6.0000 -5.0508 -0.9492
-4.9000 -65.4000 -67.4300 2.0300
0.6000 0.2000 0.5842 -0.3842
-0.5000 0.6000 -0.8404 1.4404
4.0000 -41.3000 -41.7828 0.4828
-2.2000 -15.4000 -15.7997 0.3997
-4.3000 -56.1000 -53.2914 -2.8086
-0.2000 0.5000 0.2206 0.2794
err =
15.7361
Syx =
1.6195
r =
0.9985
z =
0.6513 1.5946 -2.8078 -0.0608
y = 0.6513 + 1.5946x - 2.8078x^2 - 0.0608x^3
Correlation coefficient r = 0.9985
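As a cross-check, MATLAB's built-in polyfit fits the same cubic; it returns the coefficients highest power first, so they must be reversed to compare with z:

» p = polyfit(x,y,3);   % coefficients in descending powers: [a3 a2 a1 a0]
» fliplr(p)             % reversed to compare with z above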
» [x,y]=example2;
» z1=Linear_LS(x,y); z1
z1 =
-20.5717 3.6005
» z2=Quadratic_LS(x,y); z2
z2 =
0.2668 0.7200 -2.7231
» z3=Cubic_LS(x,y); z3
z3 =
0.6513 1.5946 -2.8078 -0.0608
» x1=min(x); x2=max(x); xx=x1:(x2-x1)/100:x2;
» yy1=z1(1)+z1(2)*xx;
» yy2=z2(1)+z2(2)*xx+z2(3)*xx.^2;
» yy3=z3(1)+z3(2)*xx+z3(3)*xx.^2+z3(4)*xx.^3;
» H=plot(x,y,'r*',xx,yy1,'g',xx,yy2,'b',xx,yy3,'m');
» xlabel('x'); ylabel('y');
» set(H,'LineWidth',3,'MarkerSize',12);
» print -djpeg075 regres4.jpg
[Figure: data points with the linear, quadratic, and cubic least-squares fits overlaid]

Linear Least Squares: y = -20.5717 + 3.6005x
Quadratic: y = 0.2668 + 0.7200x - 2.7231x^2
Cubic: y = 0.6513 + 1.5946x - 2.8078x^2 - 0.0608x^3
Standard error for polynomial regression:

s_y/x = sqrt( Sr / (n - (m+1)) )

where
• n observations
• mth-order polynomial
(start off with n degrees of freedom, use up m+1 for an mth-order polynomial)
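The same bookkeeping works for any order m; a minimal design-matrix sketch (not a routine from the course, names illustrative):

% Sketch: mth-order polynomial least squares via the design-matrix form,
% assuming data vectors x, y (m = 3 here just as an illustration)
m = 3;
n = length(x);
Z = ones(n, m+1);
for k = 1:m
    Z(:,k+1) = x(:).^k;          % columns: 1, x, x^2, ..., x^m
end
a   = (Z'*Z)\(Z'*y(:));          % a = [a0; a1; ...; am]
Sr  = sum((y(:) - Z*a).^2);      % residual sum of squares
Syx = sqrt(Sr/(n - (m+1)));      % standard error: m+1 coefficients used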
Dependence on more than one variable
e.g. dependence of runoff volume on soil type and land cover,
or dependence of aerodynamic drag on automobile shape and speed
Multiple Linear Regression

y = a0 + a1 x1 + a2 x2
e_i = y_i - (a0 + a1 x1,i + a2 x2,i)

With two independent variables the fitted model is a surface: find the best-fit "plane" through the data.
Sr(a0, a1, a2) = Σ (y_i - a0 - a1 x1,i - a2 x2,i)^2,  summed over i = 1..n

∂Sr/∂a0 = -2 Σ (y_i - a0 - a1 x1,i - a2 x2,i) = 0
∂Sr/∂a1 = -2 Σ x1,i (y_i - a0 - a1 x1,i - a2 x2,i) = 0
∂Sr/∂a2 = -2 Σ x2,i (y_i - a0 - a1 x1,i - a2 x2,i) = 0
Multiple Linear Regression
Much like polynomial regression: minimize the sum of squared residuals, then rearrange the equations into a normal system very similar to the polynomial one.
Polynomial (quadratic) regression:

[ n        Σx_i      Σx_i^2 ] [a0]   [ Σy_i       ]
[ Σx_i     Σx_i^2    Σx_i^3 ] [a1] = [ Σx_i y_i   ]
[ Σx_i^2   Σx_i^3    Σx_i^4 ] [a2]   [ Σx_i^2 y_i ]

Multiple linear regression:

[ n        Σx1,i         Σx2,i      ] [a0]   [ Σy_i      ]
[ Σx1,i    Σx1,i^2       Σx1,i x2,i ] [a1] = [ Σx1,i y_i ]
[ Σx2,i    Σx1,i x2,i    Σx2,i^2    ] [a2]   [ Σx2,i y_i ]
Multiple Linear Regression
Once again, solve by any matrix method. Cholesky decomposition is appropriate because the matrix is symmetric and positive definite.
The same machinery is very useful for fitting the power equation:
y = a0 x1^a1 x2^a2 ... xm^am

log y = log a0 + a1 log x1 + a2 log x2 + ... + am log xm
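In MATLAB the log transformation is a short sketch, assuming strictly positive data in vectors x1, x2, y (names illustrative):

% Sketch: fit y = a0 * x1.^a1 .* x2.^a2 by linear least squares on logarithms
Z  = [ones(length(y),1)  log(x1(:))  log(x2(:))];
c  = Z\log(y(:));          % linear least squares in log space
a0 = exp(c(1));            % undo the log on the intercept
a1 = c(2);
a2 = c(3);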
Example: Strength of concrete depends on cure time and cement/water ratio (or water content W/C)
cure time (days)   W/C    strength (psi)
       2           0.42       2770
       4           0.55       2639
       5           0.70       2519
      16           0.53       3450
       3           0.61       2315
       7           0.67       2545
       8           0.55       2613
      27           0.66       3694
      14           0.42       3414
      20           0.58       3634
» x1=[2 4 5 16 3 7 8 27 14 20];
» x2=[0.42 0.55 0.7 0.53 0.61 0.67 0.55 0.66 0.42 0.58];
» y=[2770 2639 2519 3450 2315 2545 2613 3694 3414 3634];
» H=plot3(x1,x2,y,'ro'); grid on; set(H,'LineWidth',5);
» H1=xlabel('Cure Time (days)'); set(H1,'FontSize',12)
» H2=ylabel('Water Content'); set(H2,'FontSize',12)
» H3=zlabel('Strength (psi)'); set(H3,'FontSize',12)
x1 (days)   x2 (W/C)   y (psi)   x1*x2    x1^2    x2^2      x1*y       x2*y
    2         0.42      2770      0.84      4     0.1764    5540.35   1163.473
    4         0.55      2639      2.2      16     0.3025   10557.82   1451.7
    5         0.7       2519      3.5      25     0.49     12592.77   1762.988
   16         0.53      3450      8.48    256     0.2809   55195.7    1828.358
    3         0.61      2315      1.83      9     0.3721    6944.629  1412.075
    7         0.67      2545      4.69     49     0.4489   17815.74   1705.22
    8         0.55      2613      4.4      64     0.3025   20900.17   1436.886
   27         0.66      3694     17.82    729     0.4356   99729.22   2437.825
   14         0.42      3414      5.88    196     0.1764   47793.4    1433.802
   20         0.58      3634     11.6     400     0.3364   72683.15   2107.811

sum(x1)=106   sum(x2)=5.69   sum(y)=29592   sum(x1*x2)=61.24
sum(x1^2)=1748   sum(x2^2)=3.3217   sum(x1*y)=349752.9   sum(x2*y)=16740.14
[ 10     106     5.69   ] [a0]   [  29592 ]
[ 106    1748    61.24  ] [a1] = [ 349753 ]
[ 5.69   61.24   3.3217 ] [a2]   [  16740 ]

(the multiple linear regression normal equations from above with the tabulated sums substituted)
Hand Calculations
Solve by Cholesky decomposition:
[ 10     106     5.69   ]   [ 3.1623    0       0    ] [ 3.1623   33.52   1.80 ]
[ 106    1748    61.24  ] = [ 33.52   24.99     0    ] [   0      24.99   0.04 ]
[ 5.69   61.24   3.3217 ]   [ 1.80     0.04    0.29  ] [   0        0     0.29 ]

i.e. A = L L^T with L lower triangular.
Forward and back substitutions then give

[a0]   [  3358 ]
[a1] = [    60 ]
[a2]   [ -1827 ]

strength (psi) = 3358 + 60 (cure time, days) - 1827 (W/C)
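A quick MATLAB check of the hand calculation; note that chol returns an upper-triangular R with A = R'*R, so R' plays the role of L above:

» A = [10 106 5.69; 106 1748 61.24; 5.69 61.24 3.3217];
» b = [29592; 349753; 16740];
» R = chol(A);      % A = R'*R, R upper triangular
» a = R\(R'\b)      % forward substitution (R'\b), then back substitution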
» [x1,x2,y]=concrete;
» z=Multi_Linear(x1,x2,y)

   x1     x2      y     (a0+a1*x1+a2*x2)   (y-a0-a1*x1-a2*x2)
    2    0.42   2770        2711.3              58.652
    4    0.55   2639        2594.7              44.267
    5    0.70   2519        2381.1             137.94
   16    0.53   3450        3357.3              92.72
    3    0.61   2315        2424.6            -109.57
    7    0.67   2545        2556.9             -11.895
    8    0.55   2613        2836.7            -223.73
   27    0.66   3694        3785.2             -91.158
   14    0.42   3414        3437.3             -23.339
   20    0.58   3634        3507.9             126.11

Syx = 130.92
r = 0.97553
z = 3358   60.499   -1827.8
function [x1,x2,y] = concrete
x1=[2 4 5 16 3 7 8 27 14 20];
x2=[0.42 0.55 0.7 0.53 0.61 0.67 0.55 0.66 0.42 0.58];
y=[2770 2639 2519 3450 2315 2545 2613 3694 3414 3634];
strength (psi) = 3358 + 60.499 (cure time, days) - 1827.8 (W/C)

Correlation coefficient r = 0.97553; regression coefficients (a0, a1, a2) = (3358, 60.499, -1827.8).
Multiple Linear Regression
strength (psi) = 3358 + 60.499 (cure time, days) - 1827.8 (W/C)

» xx=0:0.02:1; yy=0:0.02:1; [x,y]=meshgrid(xx,yy);
» z=2*x+3*y+2;
» surfc(x,y,z); grid on
» axis([0 1 0 1 0 7])
» xlabel('x1'); ylabel('x2'); zlabel('y')
Simple linear, polynomial, and multiple linear regressions are special cases of the general linear least squares model
General Linear Least Squares

y = a0 z0 + a1 z1 + a2 z2 + ... + am zm + e

Examples (linear in the a_i, even though the z_i may be highly nonlinear):

y = a1 x + a2 sin(x)
y = a0 + a1 cos(ω0 t) + a2 sin(ω0 t)
General equation in matrix form:

{y} = [Z]{a} + {e}

where

      [ z0,1  z1,1  ...  zm,1 ]
[Z] = [ z0,2  z1,2  ...  zm,2 ]
      [  ...                  ]
      [ z0,n  z1,n  ...  zm,n ]

{y}^T = [ y1  y2  ...  yn ]   (dependent variables)
{a}^T = [ a0  a1  ...  am ]   (regression coefficients)
{e}^T = [ e1  e2  ...  en ]   (residuals)
As usual, take partial derivatives to minimize the square error Sr:

Sr = Σ_{i=1..n} ( y_i - Σ_{j=0..m} a_j z_j,i )^2

This leads to the normal equations:

[Z]^T [Z] {a} = [Z]^T {y}

Solve this for {a} using Cholesky decomposition, LU decomposition, or the matrix inverse.
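In MATLAB the whole procedure takes two lines once [Z] is formed; a sketch for the sinusoid example above:

% Sketch: general linear least squares for y = a0 + a1*cos(w0*t) + a2*sin(w0*t),
% assuming column vectors t, y and a known frequency w0 (all illustrative here)
Z = [ones(size(t))  cos(w0*t)  sin(w0*t)];   % one column per basis function z_j
a = (Z'*Z)\(Z'*y);                           % normal equations [Z'Z]{a} = {Z'y}
% a = Z\y solves the same problem with better numerical conditioning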
Nonlinear Regression

y_i = f(x_i; a0, a1, ..., am) + e_i

where f is a nonlinear function of the parameters a0, a1, ..., am, and (x_i, y_i) is one of a set of n observations.

Gauss-Newton method: use a Taylor series expansion for f to linearize the original equation, truncating the higher-order terms. Let j denote the initial guess and j+1 the prediction (improved guess):
f(x_i)_{j+1} ≈ f(x_i)_j + (∂f(x_i)_j/∂a0) Δa0 + (∂f(x_i)_j/∂a1) Δa1 + ... + (∂f(x_i)_j/∂am) Δam

Nonlinear Regression
Plug the Taylor series into the original equation:

y_i = f(x_i)_j + (∂f(x_i)_j/∂a0) Δa0 + (∂f(x_i)_j/∂a1) Δa1 + ... + (∂f(x_i)_j/∂am) Δam + e_i

or

y_i - f(x_i)_j = (∂f(x_i)_j/∂a0) Δa0 + (∂f(x_i)_j/∂a1) Δa1 + ... + (∂f(x_i)_j/∂am) Δam + e_i

Nonlinear Regression
Writing one such equation for each of the n data points:

y_1 - f(x_1)_j = (∂f(x_1)_j/∂a0) Δa0 + (∂f(x_1)_j/∂a1) Δa1 + ... + (∂f(x_1)_j/∂am) Δam + e_1
y_2 - f(x_2)_j = (∂f(x_2)_j/∂a0) Δa0 + (∂f(x_2)_j/∂a1) Δa1 + ... + (∂f(x_2)_j/∂am) Δam + e_2
...
y_n - f(x_n)_j = (∂f(x_n)_j/∂a0) Δa0 + (∂f(x_n)_j/∂a1) Δa1 + ... + (∂f(x_n)_j/∂am) Δam + e_n

In matrix form:

{D} = [Z_j]{ΔA} + {E}
Gauss-Newton Method
Given all n equations, set up the matrix equation

{D} = [Z_j]{ΔA} + {E}

where

      [ y_1 - f(x_1) ]          [ Δa0 ]         [ e_1 ]
{D} = [ y_2 - f(x_2) ]   {ΔA} = [ Δa1 ]   {E} = [ e_2 ]
      [     ...      ]          [ ... ]         [ ... ]
      [ y_n - f(x_n) ]          [ Δam ]         [ e_n ]

        [ ∂f(x_1)/∂a0   ∂f(x_1)/∂a1   ...   ∂f(x_1)/∂am ]
[Z_j] = [ ∂f(x_2)/∂a0   ∂f(x_2)/∂a1   ...   ∂f(x_2)/∂am ]
        [     ...                                   ... ]
        [ ∂f(x_n)/∂a0   ∂f(x_n)/∂a1   ...   ∂f(x_n)/∂am ]

(all partial derivatives evaluated at the current guess j)
Using the same least squares approach, minimize the sum of squares of the residuals e to get {ΔA} from

[Z_j]^T [Z_j] {ΔA} = [Z_j]^T {D}

{ΔA} = ( [Z_j]^T [Z_j] )^(-1) [Z_j]^T {D}

Now update a0, a1, ..., am with {ΔA} and repeat the procedure until convergence is reached.

Gauss-Newton Method
function [x,y] = mass_spring
x = [0.00 0.11 0.18 0.25 0.32 0.44 0.55 0.61 0.68 0.80 ...
0.92 1.01 1.12 1.22 1.35 1.45 1.60 1.67 1.76 1.83 2.00];
y = [1.03 0.78 0.62 0.22 0.05 -0.20 -0.45 -0.50 -0.45 -0.31 ...
-0.21 -0.11 0.04 0.12 0.22 0.23 0.18 0.10 0.07 -0.02 -0.10];
Example: Damped Sinusoidal
Model it with f(x) = e^(-a0 x) cos(a1 x):

∂f/∂a0 = -x e^(-a0 x) cos(a1 x)
∂f/∂a1 = -x e^(-a0 x) sin(a1 x)
        [ -x_1 e^(-a0 x_1) cos(a1 x_1)   -x_1 e^(-a0 x_1) sin(a1 x_1) ]
[Z_j] = [ -x_2 e^(-a0 x_2) cos(a1 x_2)   -x_2 e^(-a0 x_2) sin(a1 x_2) ]
        [             ...                             ...             ]
        [ -x_n e^(-a0 x_n) cos(a1 x_n)   -x_n e^(-a0 x_n) sin(a1 x_n) ]

      [ y_1 - e^(-a0 x_1) cos(a1 x_1) ]
{D} = [ y_2 - e^(-a0 x_2) cos(a1 x_2) ]
      [             ...               ]
      [ y_n - e^(-a0 x_n) cos(a1 x_n) ]

Gauss-Newton Method
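The gauss_newton routine called below is not listed in this transcript; a minimal sketch for this model (the course's version may differ in its prompts and printed iteration history) could be:

function a = gauss_newton(x,y)
% Sketch: Gauss-Newton iteration for f(x) = exp(-a0*x).*cos(a1*x)
a     = input('Enter the initial guesses [a0,a1] = ');  a = a(:);
tol   = input('Enter the tolerance tol = ');
itmax = input('Enter the maximum iteration number itmax = ');
x = x(:);  y = y(:);
for iter = 1:itmax
    f  = exp(-a(1)*x).*cos(a(2)*x);                 % current model values
    Z  = [-x.*exp(-a(1)*x).*cos(a(2)*x), ...
          -x.*exp(-a(1)*x).*sin(a(2)*x)];           % Jacobian [Z_j]
    da = (Z'*Z)\(Z'*(y - f));                       % normal equations for {dA}
    a  = a + da;
    fprintf('%7.4f %9.4f %9.4f %9.4f %9.4f\n', iter, a(1), a(2), da(1), da(2));
    if norm(da) < tol
        disp('Gauss-Newton method has converged'); break
    end
end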
» [x,y]=mass_spring;
» a=gauss_newton(x,y)
Enter the initial guesses [a0,a1] = [2,3]
Enter the tolerance tol = 0.0001
Enter the maximum iteration number itmax = 50
n =
21
iter a0 a1 da0 da1
1.0000 2.1977 5.0646 0.1977 2.0646
2.0000 1.0264 3.9349 -1.1713 -1.1296
3.0000 1.1757 4.3656 0.1494 0.4307
4.0000 1.1009 4.4054 -0.0748 0.0398
5.0000 1.1035 4.3969 0.0026 -0.0085
6.0000 1.1030 4.3973 -0.0005 0.0003
7.0000 1.1030 4.3972 0.0000 0.0000
Gauss-Newton method has converged
a =
1.1030 4.3972
f(x) = e^(-1.1030x) cos(4.3972x)
Choose initial a0 = 2, a1 = 3
21 data points
f(x) = e^(-1.1030x) cos(4.3972x)
a0 = 1.1030, a1 = 4.3972