
Ch11 Curve Fitting Dr. Deshi Ye [email protected]


Post on 11-Feb-2016




TRANSCRIPT

Page 1: Ch11 Curve Fitting

Ch11 Curve Fitting

Dr. Deshi [email protected]

Page 2: Ch11 Curve Fitting

2/30

Outline

The method of Least Squares
Inferences based on the Least Squares Estimators
Curvilinear Regression
Multiple Regression

Page 3: Ch11 Curve Fitting

11.1 The Method of Least Squares

We study the case where a dependent variable is to be predicted in terms of a single independent variable: the random variable Y depends on the value x of an independent variable. The regression curve of Y on x gives the relationship between x and the mean of the corresponding distribution of Y.

Page 4: Ch11 Curve Fitting


Linear regression

Page 5: Ch11 Curve Fitting

Linear regression

Linear regression: for any x, the mean of the distribution of the Y's is given by α + βx.

In general, Y will differ from this mean, and we denote this difference by ε, so that

Y = α + βx + ε

Here ε is a random variable, and we can choose α so that the mean of the distribution of this random variable is equal to zero.

Page 6: Ch11 Curve Fitting

Example

x:  1   2   3   4   5   6   7    8    9   10   11   12
y: 16  35  45  64  86  96  106  124  134  156  164  182

Page 7: Ch11 Curve Fitting

Analysis

Fit the line ŷ = a + bx and let e_i = y_i − ŷ_i be the i-th residual. We want the residuals e_i, i = 1, …, n, to be as close as possible to zero.

Page 8: Ch11 Curve Fitting

Principle of least squares

Choose a and b so that

Σ_{i=1}^n e_i² = Σ_{i=1}^n (y_i − (a + b x_i))²

is minimum. The procedure of finding the equation of the line which best fits a given set of paired data is called the method of least squares. Some notation:

S_xx = Σ_{i=1}^n (x_i − x̄)² = Σ_{i=1}^n x_i² − (Σ_{i=1}^n x_i)² / n

S_yy = Σ_{i=1}^n (y_i − ȳ)² = Σ_{i=1}^n y_i² − (Σ_{i=1}^n y_i)² / n

S_xy = Σ_{i=1}^n (x_i − x̄)(y_i − ȳ) = Σ_{i=1}^n x_i y_i − (Σ_{i=1}^n x_i)(Σ_{i=1}^n y_i) / n
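These notations can be checked numerically. A minimal Python sketch (the data set and function names here are ours, chosen for illustration): the definition form and the computational shortcut must give the same S values.

```python
# Sketch: S_xx, S_yy, S_xy computed two ways -- by the definition
# (sums of deviations from the means) and by the shortcut formulas.
def s_values(xs, ys):
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    syy = sum((y - ybar) ** 2 for y in ys)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    return sxx, syy, sxy

def s_values_shortcut(xs, ys):
    # Shortcut form: sums of squares/products minus a correction term.
    n = len(xs)
    sxx = sum(x * x for x in xs) - sum(xs) ** 2 / n
    syy = sum(y * y for y in ys) - sum(ys) ** 2 / n
    sxy = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
    return sxx, syy, sxy

print(s_values([1, 2, 3, 4, 5], [2, 5, 4, 9, 10]))           # (10.0, 46.0, 20.0)
print(s_values_shortcut([1, 2, 3, 4, 5], [2, 5, 4, 9, 10]))  # (10.0, 46.0, 20.0)
```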

Page 9: Ch11 Curve Fitting

Least squares estimators

b = S_xy / S_xx   and   a = ȳ − b x̄,   where x̄, ȳ are the means of the x's and y's.

Fitted (or estimated) regression line: ŷ = a + bx. Residuals: observation − fitted value = y_i − (a + b x_i).

The minimum value of the sum of squares is called the residual sum of squares or the error sum of squares. We will show that

SSE = Σ_{i=1}^n (y_i − a − b x_i)² = S_yy − S_xy² / S_xx

Page 10: Ch11 Curve Fitting

EX solution

ŷ = 14.8x + 4.35
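The fitted line can be reproduced with a short least squares computation. A Python sketch (the helper name least_squares_fit is ours), applied to the x/y data from the earlier example slide:

```python
# Sketch: least squares fit b = S_xy/S_xx, a = ybar - b*xbar,
# reproducing the slide's fitted line (coefficients rounded).
def least_squares_fit(xs, ys):
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    b = sxy / sxx          # slope
    a = ybar - b * xbar    # intercept
    return a, b

xs = list(range(1, 13))
ys = [16, 35, 45, 64, 86, 96, 106, 124, 134, 156, 164, 182]
a, b = least_squares_fit(xs, ys)
print(f"y-hat = {b:.1f}x + {a:.2f}")  # y-hat = 14.8x + 4.35
```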

Page 11: Ch11 Curve Fitting

X-and-Y

X-axis: independent, predictor, carrier, input
Y-axis: dependent, predicted, response, output

Page 12: Ch11 Curve Fitting

Example

You're a marketing analyst for Hasbro Toys. You gather the following data:

Ad $:           1  2  3  4  5
Sales (Units):  1  1  2  2  4

What is the relationship between sales & advertising?

Page 13: Ch11 Curve Fitting

Scattergram: Sales vs. Advertising

[Scatter plot of the data above: Sales (0 to 4) on the y-axis against Advertising (0 to 5) on the x-axis]

Page 14: Ch11 Curve Fitting

The Least Squares Estimators

Page 15: Ch11 Curve Fitting

11.2 Inference based on the Least Squares Estimators

We assume that the regression is linear in x and, furthermore, that the n random variables Y_i are independently normally distributed with means α + β x_i. Statistical model for straight-line regression:

Y_i = α + β x_i + ε_i,   i = 1, …, n

where the ε_i are independent normally distributed random variables having zero means and the common variance σ².

Page 16: Ch11 Curve Fitting

Standard error of estimate

The i-th deviation is e_i = y_i − ŷ_i, and the estimate of σ² is

s_e² = (1 / (n − 2)) Σ_{i=1}^n [y_i − (a + b x_i)]²

The estimate of σ² can also be written as

s_e² = (S_yy − S_xy² / S_xx) / (n − 2)

Page 17: Ch11 Curve Fitting

Statistics for inferences: based on the assumptions made concerning the distribution of the values of Y, the following theorem holds.

Theorem. The statistics

t = ((a − α) / s_e) · √( n S_xx / (S_xx + n x̄²) )   and   t = ((b − β) / s_e) · √S_xx

are values of random variables having the t distribution with n − 2 degrees of freedom.

Confidence intervals:

for α:  a ± t_{α/2} · s_e · √(1/n + x̄²/S_xx)

for β:  b ± t_{α/2} · s_e / √S_xx

Page 18: Ch11 Curve Fitting

Example

The following data pertain to the number of computer jobs per day and the central processing unit (CPU) time required.

Number of jobs x:  1  2  3  4  5
CPU time y:        2  5  4  9  10

Page 19: Ch11 Curve Fitting

EX

1) Obtain a least squares fit of a line to the observations on CPU time.

b = S_xy / S_xx = 2,   a = ȳ − b x̄ = 0

so the fitted line is ŷ = 2x.

Page 20: Ch11 Curve Fitting

Example

2) Construct a 95% confidence interval for α.

s_e² = (S_yy − S_xy² / S_xx) / (n − 2) = (46 − 400/10) / 3 = 2

For the 95% confidence interval of α we use t_{α/2} = t_{0.025} = 3.182 (with n − 2 = 3 degrees of freedom):

a ± t_{α/2} · s_e · √(1/n + x̄²/S_xx) = 0 ± 3.182 · √2 · √(1/5 + 9/10) = ±4.72

Page 21: Ch11 Curve Fitting

Example

3) Test the null hypothesis β = 1 against the alternative hypothesis β > 1 at the 0.05 level of significance.

Solution: the t statistic is given by

t = ((b − β) / s_e) · √S_xx = ((2 − 1) / √2) · √10 = 2.236

Criterion: reject the null hypothesis if t > t_{0.05} = 2.353 (with 3 degrees of freedom).

Decision: since 2.236 < 2.353, we cannot reject the null hypothesis.
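The test statistic can likewise be computed directly. A Python sketch for the same CPU-time data:

```python
import math

# Sketch: t statistic for H0: beta = 1, CPU-time data.
xs = [1, 2, 3, 4, 5]
ys = [2, 5, 4, 9, 10]
n = len(xs)
xbar, ybar = sum(xs) / n, sum(ys) / n
sxx = sum((x - xbar) ** 2 for x in xs)
syy = sum((y - ybar) ** 2 for y in ys)
sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
b = sxy / sxx
se = math.sqrt((syy - sxy ** 2 / sxx) / (n - 2))
beta0 = 1                              # hypothesized slope
t = (b - beta0) / se * math.sqrt(sxx)
print(f"t = {t:.3f}")                  # t = 2.236, below t_0.05 = 2.353
```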

Page 22: Ch11 Curve Fitting

11.3 Curvilinear Regression

The regression curve may be nonlinear. Polynomial regression:

Y = β_0 + β_1 x + β_2 x² + … + β_p x^p + ε

If the regression of Y on x is exponential, the mean of the distribution of values of Y is given by y = α · β^x.

Taking logarithms, we have log y = log α + x · log β.

Thus, we can estimate log α and log β from the pairs of values (x_i, log y_i).
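The log transformation above can be sketched as follows. The data here are invented to lie exactly on an exponential curve, and natural logarithms are used (any base works as long as the same base is used throughout):

```python
import math

# Sketch: fit y = alpha * beta**x by regressing log y on x.
xs = [1, 2, 3, 4, 5]
alpha_true, beta_true = 2.0, 1.5
ys = [alpha_true * beta_true ** x for x in xs]   # exact exponential data

logys = [math.log(y) for y in ys]
n = len(xs)
xbar = sum(xs) / n
lbar = sum(logys) / n
sxx = sum((x - xbar) ** 2 for x in xs)
sxy = sum((x - xbar) * (l - lbar) for x, l in zip(xs, logys))
slope = sxy / sxx                  # estimates log(beta)
intercept = lbar - slope * xbar    # estimates log(alpha)
alpha_hat = math.exp(intercept)
beta_hat = math.exp(slope)
print(round(alpha_hat, 6), round(beta_hat, 6))   # 2.0 1.5
```

Because the invented data lie exactly on the curve, the fit recovers α and β up to floating-point error.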

Page 23: Ch11 Curve Fitting

Polynomial regression

If there is no clear indication about the functional form of the regression of Y on x, we assume it is polynomial regression:

Y = a_0 + a_1 x + a_2 x² + … + a_k x^k

Page 24: Ch11 Curve Fitting

Polynomial Fitting

- Really just a generalization of the previous case
- Exact solution
- Just big matrices
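The "just big matrices" view can be sketched with the normal equations of the polynomial fit. This is a stdlib-only Python sketch with our own helper names; in practice a library routine such as numpy.polyfit would be used instead:

```python
# Sketch: polynomial least squares via the normal equations,
# solved with plain Gaussian elimination (illustration only).
def solve(A, v):
    """Solve A x = v by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [v[i]] for i, row in enumerate(A)]   # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def polyfit(xs, ys, k):
    """Least squares coefficients a_0..a_k of a degree-k polynomial."""
    # Normal equations: sum_j a_j * sum_i x_i^(j+m) = sum_i y_i * x_i^m
    A = [[sum(x ** (j + m) for x in xs) for j in range(k + 1)]
         for m in range(k + 1)]
    v = [sum(y * x ** m for x, y in zip(xs, ys)) for m in range(k + 1)]
    return solve(A, v)

# Data lying exactly on y = 1 + 2x + 3x^2 is recovered exactly.
coeffs = polyfit([0, 1, 2, 3, 4], [1, 6, 17, 34, 57], 2)
print([round(c, 6) for c in coeffs])  # [1.0, 2.0, 3.0]
```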

Page 25: Ch11 Curve Fitting

11.4 Multiple Regression

The mean of Y on x is given by b_0 + b_1 x_1 + b_2 x_2 + … + b_k x_k.

Minimize

Σ_{i=1}^n [y_i − (b_0 + b_1 x_{i1} + … + b_k x_{ik})]²

We can solve it when k = 2 by the following normal equations:

Σ y    = n b_0 + b_1 Σ x_1 + b_2 Σ x_2
Σ x_1 y = b_0 Σ x_1 + b_1 Σ x_1² + b_2 Σ x_1 x_2
Σ x_2 y = b_0 Σ x_2 + b_1 Σ x_1 x_2 + b_2 Σ x_2²
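The k = 2 normal equations above can be solved directly as a 3x3 linear system. A Python sketch with invented data (Cramer's rule is adequate at this size; the function names are ours):

```python
# Sketch: solve the k = 2 normal equations for b0, b1, b2.
def fit_two_predictors(x1s, x2s, ys):
    n = len(ys)
    sx12 = sum(a * b for a, b in zip(x1s, x2s))
    # Coefficient matrix and right-hand side of the normal equations.
    A = [[n,        sum(x1s),                 sum(x2s)],
         [sum(x1s), sum(x * x for x in x1s),  sx12],
         [sum(x2s), sx12,                     sum(x * x for x in x2s)]]
    v = [sum(ys),
         sum(x * y for x, y in zip(x1s, ys)),
         sum(x * y for x, y in zip(x2s, ys))]

    def det3(M):
        return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
              - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
              + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

    d = det3(A)
    bs = []
    for col in range(3):                  # Cramer's rule, column by column
        Ac = [row[:] for row in A]
        for r in range(3):
            Ac[r][col] = v[r]
        bs.append(det3(Ac) / d)
    return bs  # [b0, b1, b2]

# Data generated from y = 1 + 2*x1 + 3*x2 is recovered exactly.
x1s = [0, 1, 2, 0, 1, 2]
x2s = [0, 0, 1, 1, 2, 2]
ys = [1 + 2 * a + 3 * b for a, b in zip(x1s, x2s)]
print([round(b, 6) for b in fit_two_predictors(x1s, x2s, ys)])  # [1.0, 2.0, 3.0]
```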

Page 26: Ch11 Curve Fitting


Example

P365.

Page 27: Ch11 Curve Fitting

Multiple Linear Fitting

X_1(x), …, X_M(x) are arbitrary fixed functions of x (they can be nonlinear), called the basis functions.

The normal equations of the least squares problem can be put in matrix form and solved.

Page 28: Ch11 Curve Fitting

Correlation Models

1. How strong is the linear relationship between 2 variables?
2. The coefficient of correlation is used to answer this.

The population correlation coefficient is denoted ρ. Values range from −1 to +1.

Page 29: Ch11 Curve Fitting

Correlation

Standardized observation:

(x_i − x̄) / s_x = (Observation − Sample mean) / (Sample standard deviation)

The sample correlation coefficient r:

r = (1 / (n − 1)) Σ_{i=1}^n ((x_i − x̄) / s_x) · ((y_i − ȳ) / s_y)
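The formula for r can be sketched in Python, applied here to the Hasbro advertising data from the earlier example (the function name is ours):

```python
import math

# Sketch: sample correlation coefficient from standardized observations.
def correlation(xs, ys):
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sx = math.sqrt(sum((x - xbar) ** 2 for x in xs) / (n - 1))  # sample std dev
    sy = math.sqrt(sum((y - ybar) ** 2 for y in ys) / (n - 1))
    return sum((x - xbar) / sx * (y - ybar) / sy
               for x, y in zip(xs, ys)) / (n - 1)

r = correlation([1, 2, 3, 4, 5], [1, 1, 2, 2, 4])  # Ad $ vs. Sales
print(round(r, 3))  # 0.904
```

The strong positive r is consistent with the upward-sloping scattergram of sales against advertising.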

Page 30: Ch11 Curve Fitting

Coefficient of Correlation Values

−1.0     −0.5     0     +0.5     +1.0

Values near 0 indicate no correlation. Moving from 0 toward −1.0 indicates an increasing degree of negative correlation; moving from 0 toward +1.0 indicates an increasing degree of positive correlation.