an introduction to splines - simon fraser...
TRANSCRIPT
An Introduction to Splines
Trinity River Restoration Program
Workshop on Outmigration: Population Estimation
October 6–8, 2009
Outline
1 Linear Regression
  Simple Regression and the Least Squares Method
  Least Squares Fitting in R
  Polynomial Regression
2 Smoothing Splines
  Simple Splines
  B-splines
  Overfitting and Smoothness
Simple Linear Regression
Daily temperatures in Montreal from April 1 (Day 91) to June 30 (Day 181), 1961.
[Figure: scatterplot "Montreal Temp. -- April 1 to June 30, 1961"; x-axis: Day of Year (100 to 180); y-axis: Temperature (0 to 20).]
Introduction to Splines: Linear Regression, Simple Regression and the Least Squares Method 5/52
Simple Linear Regression
The Model

Assumptions

Mean: On average, the change in the response is proportional to the change in the predictor.
Errors:
1. The deviation in the response for any observation does not depend on any other observation.
2. The average magnitude of the deviation is the same for all values of the predictor.

Mathematically

For $i = 1, \ldots, n$:
$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$$
where $\varepsilon_1, \ldots, \varepsilon_n$ are independent with mean 0 and variance $\sigma^2$.
The Least Squares Method
Example: The Montreal Data
[Figure: two panels of the scatterplot "Montreal Temp. -- April 1 to June 30, 1961"; x-axis: Day of Year; y-axis: Temperature.]
The Least Squares Method
The Residuals

Definition
Given values for $\beta_0$ and $\beta_1$, the residual for the $i$th observation is the difference between the observed and the predicted response:
$$e_i = y_i - \hat{y}_i$$
where $\hat{y}_i = \beta_0 + \beta_1 x_i$.
The Least Squares Method
The Least Squares Criterion

The least squares method defines the best values of $\beta_0$ and $\beta_1$ to be those that minimize the sum of the squared residuals:
$$SS = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2.$$
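The criterion can be sketched in R as follows. This is only an illustration under stated assumptions: the data are simulated rather than read from the Montreal file, and the helper name `ss` is hypothetical. It compares SS for an arbitrary candidate line with SS for the least squares line, whose coefficients are computed from the standard closed-form expressions.

```r
# Simulated stand-in for the spring temperature data (assumption, not the real file)
set.seed(1)
x <- 90:181
y <- -15 + 0.2 * x + rnorm(length(x), sd = 3)

# Sum of squared residuals for a candidate intercept b0 and slope b1
ss <- function(b0, b1) sum((y - (b0 + b1 * x))^2)

# Closed-form least squares estimates for simple linear regression
b1_hat <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
b0_hat <- mean(y) - b1_hat * mean(x)

ss_candidate <- ss(-10, 0.15)       # an arbitrary line
ss_best      <- ss(b0_hat, b1_hat)  # the least squares line

# lm() should return the same coefficients
fit <- lm(y ~ x)
```

Any other choice of intercept and slope gives an SS at least as large as `ss_best`.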
The Least Squares Method
Example: The Montreal Data
[Figure: two panels of the scatterplot "Montreal Temp. -- April 1 to June 30, 1961"; x-axis: Day of Year; y-axis: Temperature. Panel labels: SS=1549.37 and SS=1148.56.]
Least Squares Fitting in R
The Data

Suppose that the data is a data frame with elements:
- x: the days from 90 to 181
- y: the observed temperatures

> data = read.table("MontrealTemp1.txt")
> summary(data)
       x               y
 Min.   : 90.0   Min.   :-0.90
 1st Qu.:112.8   1st Qu.: 5.60
 Median :135.5   Median :11.55
 Mean   :135.5   Mean   :11.46
 3rd Qu.:158.2   3rd Qu.:16.70
 Max.   :181.0   Max.   :23.60
>
Least Squares Fitting in R
Fitting the Model

Fitting the model with lm:

> lm(y~x,data)

Call:
lm(formula = y ~ x, data = data)

Coefficients:
(Intercept)            x
   -15.3996       0.1982
Least Squares Fitting in R
Fitting the Model

Fitting the model with lm:

> lmfit = lm(y~x,data)
> attributes(lmfit)
$names
 [1] "coefficients"  "residuals"
 [3] "effects"       "rank"
 [5] "fitted.values" "assign"
 [7] "qr"            "df.residual"
 [9] "xlevels"       "call"
[11] "terms"         "model"

$class
[1] "lm"

>
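The components listed by `attributes()` can also be reached through R's standard accessor functions. The sketch below uses simulated data as a stand-in for the Montreal file (an assumption) and checks that each observed value is the sum of its fitted value and residual.

```r
# Simulated stand-in data (assumption, not the real file)
set.seed(2)
data <- data.frame(x = 90:181)
data$y <- -15 + 0.2 * data$x + rnorm(nrow(data), sd = 3)

lmfit <- lm(y ~ x, data)

b   <- coef(lmfit)    # same values as lmfit$coefficients
e   <- resid(lmfit)   # same values as lmfit$residuals
fit <- fitted(lmfit)  # same values as lmfit$fitted.values

# Observed response = fitted value + residual for every observation
recon <- fit + e
```

The accessors are generally preferable to `$` because they do not depend on the internal layout of the fitted object.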
Least Squares Fitting in R
The Fitted Line

Plotting the fitted line over the raw data:

# Plot the raw data
> plot(data$x,data$y,main="Montreal Temp. ...",xlab="Day of Year",ylab="Temperature")
# Add the fitted line (fitted() returns lmfit$fitted.values)
> lines(data$x,fitted(lmfit),col="red",lwd=3)
Least Squares Fitting in R
The Fitted Line
[Figure: scatterplot "Montreal Temp. -- April 1 to June 30, 1961" with the fitted line; x-axis: Day of Year; y-axis: Temperature.]
The Least Squares Method
Goodness-of-Fit Testing

Residual Diagnostics

The value of the residuals should not depend on x or y in any systematic way.

- Common indications of lack of fit:
  - trends with x or y (curves or clusters of high/low values)
  - constant increase/decrease (funnel shape)
  - increase followed by decrease (football shape)
  - very large (+ or -) values (outliers)
- Assessed by plotting e versus x and y.
Least Squares Fitting in R
Residual Plots

Plotting the residuals versus the predictor and response:

## Plot the residuals versus day
> plot(data$x,resid(lmfit),xlab="Day of Year",ylab="Residual")
> abline(h=0)
## Plot the residuals versus temperature
> plot(data$y,resid(lmfit),xlab="Temperature",ylab="Residual")
> abline(h=0)
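One caveat when reading the second plot: in a least squares fit with an intercept, the residuals are uncorrelated with the fitted values but positively correlated with the observed response, so an upward drift in the residuals-versus-temperature panel appears even when the model is correct. A common alternative is to plot the residuals against the fitted values. The sketch below illustrates this with simulated data (an assumption; the Montreal file is not used).

```r
# Simulated stand-in data from a correctly specified linear model (assumption)
set.seed(3)
data <- data.frame(x = 90:181)
data$y <- -15 + 0.2 * data$x + rnorm(nrow(data), sd = 3)
lmfit <- lm(y ~ x, data)

e <- resid(lmfit)
cor_fitted   <- cor(e, fitted(lmfit))  # zero up to rounding error
cor_response <- cor(e, data$y)         # positive even though the model is correct

## Plot the residuals versus the fitted temperatures
plot(fitted(lmfit), e, xlab = "Fitted temperature", ylab = "Residual")
abline(h = 0)
```

Here `cor_response` equals the square root of one minus the model's R-squared, which is why the drift is weak for well-fitting models and strong for poorly fitting ones.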
Least Squares Fitting in R
The Fitted Line
[Figure: two panels. "Residuals vs. Day" (x-axis: Day of Year; y-axis: Residual) and "Residuals vs. Temperature" (x-axis: Temperature; y-axis: Residual).]
Exercises

1. Montreal Temperature Data – April 1 to June 30, 1961
   File: Intro to splines\Exercises\montreal temp 1.R
   Use the provided code to fit the simple linear regression model to the Montreal temperature data from the spring of 1961, plot the fitted line, and produce the residual plots.
2. Montreal Temperature Data – Jan. 1 to Dec. 31, 1961
   File: Intro to splines\Exercises\montreal temp 2.R
   Repeat exercise 1 with the data from all of 1961.
Polynomial Regression
Motivation
[Figure: scatterplot "Montreal Temp. -- January 1 to December 31, 1961"; x-axis: Day of Year (0 to 300); y-axis: Temperature (-20 to 20).]
Polynomial Regression
Motivation
[Figure: two panels. "Residuals vs. Day" (x-axis: Day of Year; y-axis: Residual) and "Residuals vs. Temperature" (x-axis: Temperature; y-axis: Residual).]
Polynomial Regression
Polynomials

Definition
A polynomial of degree $D$ is a function formed by linear combinations of the powers of its argument up to $D$:
$$y = \beta_0 + \beta_1 x + \beta_2 x^2 + \cdots + \beta_D x^D$$

Specific Polynomials

Linear: $y = \beta_0 + \beta_1 x$
Quadratic: $y = \beta_0 + \beta_1 x + \beta_2 x^2$
Cubic: $y = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3$
Quartic: $y = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3 + \beta_4 x^4$
Quintic: $y = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3 + \beta_4 x^4 + \beta_5 x^5$
Polynomial Regression
The Design Matrix

Definition
The design matrix for a regression model with $n$ observations and $p$ predictors is the matrix with $n$ rows and $p$ columns such that the value of the $j$th predictor for the $i$th observation is located in column $j$ of row $i$.

Design matrix for a polynomial of degree $D$ (row labels on the left):
$$\begin{array}{c|cccccc}
1 & 1 & x_1 & x_1^2 & x_1^3 & \cdots & x_1^D \\
2 & 1 & x_2 & x_2^2 & x_2^3 & \cdots & x_2^D \\
3 & 1 & x_3 & x_3^2 & x_3^3 & \cdots & x_3^D \\
\vdots & & & & \vdots & & \\
n & 1 & x_n & x_n^2 & x_n^3 & \cdots & x_n^D
\end{array}$$
Polynomial Regression in R
Constructing the Design Matrix – Quadratic

The design matrix for polynomial regression can be generated with the function outer():

> D = 2
> X = outer(data$x,1:D,"^")
> X[1:5,]
     [,1] [,2]
[1,]    1    1
[2,]    2    4
[3,]    3    9
[4,]    4   16
[5,]    5   25
>

Note: we do not need to include the intercept column, because lm() adds an intercept by default.
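As a cross-check on the outer() construction, the sketch below builds the same quadratic design matrix for simulated full-year data (an assumption; the coefficients used to simulate are arbitrary) and compares the fit with one specified through R's built-in `poly()` with `raw = TRUE`, which generates the same raw powers.

```r
# Simulated stand-in for a full year of daily temperatures (assumption)
set.seed(4)
x <- 1:365
y <- -24 + 0.41 * x - 0.001 * x^2 + rnorm(length(x), sd = 4)

D <- 2
X <- outer(x, 1:D, "^")   # columns x and x^2; no intercept column

fit_outer <- lm(y ~ X)                        # lm() adds the intercept
fit_poly  <- lm(y ~ poly(x, D, raw = TRUE))   # same raw powers via poly()

coef_outer <- unname(coef(fit_outer))
coef_poly  <- unname(coef(fit_poly))
```

Both fits use identical columns, so the coefficient vectors agree; only the coefficient names differ.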
Polynomial Regression in R
Least Squares Fitting – Quadratic

> lmfit = lm(y~X,data)
> attributes(lmfit)
$names
[1] "coefficients" "residuals" ...

$class
[1] "lm"

> lmfit$coefficients
  (Intercept)            X1            X2
-23.715358962   0.413901580  -0.001014625
Polynomial Regression in R
Fitted Model – Quadratic
[Figure: two panels. Scatterplot "Montreal Temp. -- January 1 to December 31, 1961" (x-axis: Day of Year; y-axis: Temperature) and a residual plot (x-axis: Day of Year; y-axis: Residual).]
Exercises

1. Montreal Temperature Data – Jan. 1 to Dec. 31, 1961
   File: Intro to splines\Exercises\montreal temp 3.R
   Use the provided code to fit polynomial regression models of varying degree to the data for all of 1961. Models of different degree are constructed by setting the variable D (e.g., D=2 produces a quadratic model). What is the minimal degree required for the model to fit well?
2. Montreal Temperature Data – Jan. 1, 1961, to Dec. 31, 1962
   File: Intro to splines\Exercises\montreal temp 4.R
   Repeat this exercise using the data from both 1961 and 1962.
Splines
Motivation
[Figure: two panels. Scatterplot "Montreal Temp. -- January 1 to December 31, 1962" (x-axis: Day of Year, 0 to 600; y-axis: Temperature) and a residual plot (x-axis: Day of Year; y-axis: Residual).]
How is the temperature changing in the spring of 1962?

$$y = -7.6 - 8.3x - 0.3x^2 - 5.2 \times 10^{-4}x^3 + 4.4 \times 10^{-6}x^4 - 2.1 \times 10^{-8}x^5 + 6.0 \times 10^{-11}x^6 - 8.9 \times 10^{-14}x^7 + 5.5 \times 10^{-17}x^8$$
Splines
A Linear Spline for the Montreal Temperature Data
[Figure: two panels. Scatterplot "Montreal Temp. -- January 1 to December 31, 1962" (x-axis: Day of Year, 0 to 600; y-axis: Temperature) and a residual plot (x-axis: Day of Year; y-axis: Residual).]
How is the temperature changing in the spring of 1962?
$$y = -144.5 + 0.3x$$
Splines
Linear Splines

Definition
A linear spline is a continuous function formed by connecting linear segments. The points where the segments connect are called the knots of the spline.
Splines
Higher Order Splines

Definition
A spline of degree $D$ is a function formed by connecting polynomial segments of degree $D$ so that:
- the function is continuous,
- the function has $D - 1$ continuous derivatives, and
- the $D$th derivative is constant between knots.
Simple Splines
The Truncated Polynomials

Definition
The truncated polynomial of degree $D$ associated with a knot $\xi_k$ is the function which is equal to 0 to the left of $\xi_k$ and equal to $(x - \xi_k)^D$ to the right of $\xi_k$:
$$(x - \xi_k)_+^D = \begin{cases} 0 & x < \xi_k \\ (x - \xi_k)^D & x \ge \xi_k \end{cases}$$

The equation for a spline of degree $D$ with $K$ knots is:
$$y = \beta_0 + \sum_{d=1}^{D} \beta_d x^d + \sum_{k=1}^{K} b_k (x - \xi_k)_+^D$$
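The two formulas above can be written directly as R functions. This is a minimal sketch; the function names `trunc_poly` and `spline_value`, and the small numerical example, are hypothetical and chosen only for illustration.

```r
# Truncated polynomial (x - knot)_+^D, vectorised over x
trunc_poly <- function(x, knot, D) ifelse(x >= knot, (x - knot)^D, 0)

# Value of a spline of degree D with coefficients
# beta = (beta_0, ..., beta_D) and b = (b_1, ..., b_K)
spline_value <- function(x, beta, b, knots) {
  D <- length(beta) - 1
  poly_part  <- outer(x, 0:D, "^") %*% beta
  trunc_part <- sapply(knots, function(k) trunc_poly(x, k, D)) %*% b
  as.vector(poly_part + trunc_part)
}

# Linear spline (D = 1) with knots at 2 and 5:
# slope 1 before the first knot, 1 + 2 = 3 between the knots, 3 - 4 = -1 after
knots <- c(2, 5)
vals <- spline_value(c(0, 2, 3, 5, 6), beta = c(0, 1), b = c(2, -4), knots = knots)
```

Each coefficient $b_k$ is the change in slope at knot $\xi_k$ in the linear case, which is why the function stays continuous while its slope jumps.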
Simple Splines
The Design Matrix

The design matrix for a spline of degree $D$ with $K$ knots is the $n$ by $1 + D + K$ matrix with entries:
$$\begin{pmatrix}
1 & x_1 & x_1^2 & \cdots & x_1^D & (x_1 - \xi_1)_+^D & \cdots & (x_1 - \xi_K)_+^D \\
1 & x_2 & x_2^2 & \cdots & x_2^D & (x_2 - \xi_1)_+^D & \cdots & (x_2 - \xi_K)_+^D \\
1 & x_3 & x_3^2 & \cdots & x_3^D & (x_3 - \xi_1)_+^D & \cdots & (x_3 - \xi_K)_+^D \\
\vdots & & & & & & & \vdots \\
1 & x_n & x_n^2 & \cdots & x_n^D & (x_n - \xi_1)_+^D & \cdots & (x_n - \xi_K)_+^D
\end{pmatrix}$$
Simple Splines in R
The Design Matrix

After defining the degree and the locations of the knots, the design matrix can be generated with the functions outer and cbind:

> D = 3
> K = 5
> knots = 730 * (1:K)/(K+1)
> X1 = outer(data$x,1:D,"^")
> X2 = outer(data$x,knots,">") * outer(data$x,knots,"-")^D
> X = cbind(X1,X2)
> round(X[c(1,150,300),1:5],1)
     [,1]  [,2]     [,3]      [,4]   [,5]
[1,]    1     1        1       0.0      0
[2,]  150 22500  3375000   22745.4      0
[3,]  300 90000 27000000 5671495.4 181963
>
Simple Splines in R
Fitting the Spline Model

> lmfit = lm(y~X,data=data)
Simple Splines in R
Fitted Cubic Spline
[Figure: two panels. Scatterplot "Montreal Temp. -- January 1 to December 31, 1962" (x-axis: Day of Year, 0 to 600; y-axis: Temperature) and a residual plot (x-axis: Day of Year; y-axis: Residual).]
Exercises

1. Montreal Temperature Data – Jan. 1 to Dec. 31, 1961
   File: Intro to splines\Exercises\montreal temp 5.R
   Use the code provided to fit splines of varying degree and with different numbers of knots to the data from 1961 and 1962.
The B-Spline Basis
Troubles with Truncated Polynomials

Splines computed from the truncated polynomials may be numerically unstable because:
- the values in the design matrix may be very large, and
- the columns of the design matrix may be highly correlated.
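Both problems can be seen numerically. The sketch below is an illustration under assumptions: it uses days 1 to 730 as the predictor and the same degree and knots as the slides, builds the truncated polynomial design matrix (with an explicit intercept column) and the B-spline design matrix from `bs()` in the splines package, and compares their largest entries and condition numbers.

```r
library(splines)

x <- 1:730
D <- 3
K <- 5
knots <- 730 * (1:K) / (K + 1)

# Truncated polynomial design matrix with an explicit intercept column
X_tp <- cbind(1, outer(x, 1:D, "^"),
              outer(x, knots, ">") * outer(x, knots, "-")^D)

# B-spline design matrix for the same degree and knots
X_bs <- bs(x, knots = knots, degree = D, intercept = TRUE)
X_bs_plain <- matrix(as.numeric(X_bs), nrow = nrow(X_bs))  # drop the "bs" class

max_tp <- max(abs(X_tp))    # largest entry is 730^3
max_bs <- max(X_bs_plain)   # B-spline basis values lie between 0 and 1

# Condition numbers: ratio of the largest to the smallest singular value
kappa_tp <- kappa(X_tp, exact = TRUE)
kappa_bs <- kappa(X_bs_plain, exact = TRUE)
```

Both matrices have $1 + D + K = 9$ columns, but the condition number of the truncated polynomial matrix is many orders of magnitude larger than that of the B-spline matrix.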
The B-spline Basis in R
Generating the Design Matrix and Fitting the Model

The B-spline design matrix can be constructed via the function bs provided by the splines library:

> library(splines)
> D = 3
> K = 5
> knots = 730 * (1:K)/(K+1)
> X = bs(data$x,knots=knots,degree=D,intercept=TRUE)
> lmfit = lm(y~X-1,data=data)
>
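To draw the fitted curve on a finer grid than the observed days, the same basis can be evaluated at new x values with the `predict()` method for `bs` objects and multiplied by the fitted coefficients. The sketch below assumes simulated two-year data in place of the Montreal file; the grid spacing is arbitrary.

```r
library(splines)

# Simulated stand-in for two years of daily temperatures (assumption)
set.seed(7)
data <- data.frame(x = 1:730)
data$y <- -10 * cos(2 * pi * data$x / 365) + 6 + rnorm(nrow(data), sd = 4)

D <- 3
K <- 5
knots <- 730 * (1:K) / (K + 1)
X <- bs(data$x, knots = knots, degree = D, intercept = TRUE)
lmfit <- lm(y ~ X - 1, data = data)

# Evaluate the same basis on a finer grid and apply the coefficients
grid  <- seq(1, 730, by = 0.5)
Xgrid <- predict(X, newx = grid)
curve_hat <- as.vector(Xgrid %*% coef(lmfit))

# lines(grid, curve_hat, col = "red", lwd = 3) would add the curve to a plot
```

The model is fitted with `-1` because the basis built with `intercept=TRUE` already spans the constant function; adding lm()'s own intercept would make the design matrix rank deficient.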
The B-spline Basis in R
Fitted Cubic B-spline Model
[Figure: two panels. Scatterplot "Montreal Temp. -- January 1, 1961, to December 31, 1962" (x-axis: Day of Year, 0 to 600; y-axis: Temperature) and a residual plot (x-axis: Day of Year; y-axis: Residual).]
Exercises

1. Montreal Temperature Data – Jan. 1 to Dec. 31, 1961
   File: Intro to splines\Exercises\montreal temp 6.R
   Fit B-splines to the data from 1961 and 1962 using the code in the file. Increase the number of knots to see how this affects the fit of the curve. What happens when the number of knots is very large, say K = 50?
Overfitting and Smoothness
Motivation

A cubic spline with 50 knots:
[Figure: scatterplot "Montreal Temp. -- January 1, 1961, to December 31, 1962" with the fitted spline; x-axis: Day of Year (0 to 600); y-axis: Temperature (-20 to 20).]
Overfitting and Smoothness
Knot Selection

Concept

The shape of a spline can be controlled by carefully choosing the number of knots and their exact locations in order to:
1. allow flexibility where the trend changes quickly, and
2. avoid overfitting where the trend changes little.

Challenge

Choosing the number of knots and their location is a very difficult problem to solve.
Overfitting and Smoothness
Penalization

Concept

We can also balance overfitting and smoothness by controlling the size of the spline coefficients.
Overfitting and Smoothness
Penalization for Truncated Polynomials

Penalization for the Linear Spline

- Consider the equation for each segment of the spline:
  $(0, \xi_1)$: $y = \beta_0 + \beta_1 x$
  $(\xi_1, \xi_2)$: $y = (\beta_0 - b_1 \xi_1) + (\beta_1 + b_1) x$
  $(\xi_2, \xi_3)$: $y = (\beta_0 - b_1 \xi_1 - b_2 \xi_2) + (\beta_1 + b_1 + b_2) x$
- The spline is smooth if $b_1, b_2, \ldots, b_K$ are all close to 0.

Penalized Least Squares

$$PSS = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 + \lambda \sum_{k=1}^{K} b_k^2$$
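For a fixed $\lambda$, the PSS above is a quadratic function of the coefficients and has a closed-form minimizer. The sketch below is one way to compute it, under assumptions: the data are simulated on the unit interval, the values of K and $\lambda$ are arbitrary, and the names `C`, `P`, and `fit_pss` are hypothetical. Writing the design matrix as $C$ and the penalty as a diagonal matrix $P$ with zeros for $\beta_0, \beta_1$ and ones for each $b_k$, the minimizer is $(C^\top C + \lambda P)^{-1} C^\top y$.

```r
# Simulated data on [0, 1] (assumption)
set.seed(8)
n <- 200
x <- seq(0, 1, length.out = n)
y <- sin(2 * pi * x) + rnorm(n, sd = 0.3)

K <- 20
knots <- (1:K) / (K + 1)

# Linear spline design matrix: intercept, x, and K truncated linear terms
C <- cbind(1, x, outer(x, knots, function(x, k) pmax(x - k, 0)))

# Penalty matrix: zero for beta_0 and beta_1, one for each b_k
P <- diag(c(0, 0, rep(1, K)))

# Minimizer of PSS for a given lambda
fit_pss <- function(lambda) {
  as.vector(solve(crossprod(C) + lambda * P, crossprod(C, y)))
}

theta_ls  <- fit_pss(0)    # ordinary least squares (no penalty)
theta_pen <- fit_pss(10)   # penalized fit

# Size of the knot coefficients and residual sum of squares for each fit
size_ls  <- sum(theta_ls[-(1:2)]^2)
size_pen <- sum(theta_pen[-(1:2)]^2)
rss_ls   <- sum((y - C %*% theta_ls)^2)
rss_pen  <- sum((y - C %*% theta_pen)^2)
```

The penalized fit trades a somewhat larger residual sum of squares for smaller knot coefficients, that is, for smaller changes in slope at the knots.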
Overfitting and Smoothness
Penalization for the B-spline Basis

Penalization for the B-spline

The spline is smooth if $b_1, b_2, \ldots, b_K$ are all close to each other. (But not necessarily close to 0.)

Penalized Least Squares

$$PSS = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 + \lambda \sum_{k=3}^{K} \left( (b_k - b_{k-1}) - (b_{k-1} - b_{k-2}) \right)^2$$
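The second-difference penalty can be written as $\lambda \lVert \Delta b \rVert^2$, where each row of the matrix $\Delta$ picks out $(b_k - b_{k-1}) - (b_{k-1} - b_{k-2})$. The sketch below is an illustration under assumptions: the data are simulated, the penalty is applied to all coefficients of the cubic B-spline basis returned by `bs()`, and the names `B`, `Dmat`, `b_pen`, and `b_ls` are hypothetical. It is not necessarily how the exercise code implements the fit.

```r
library(splines)

# Simulated stand-in for two years of daily temperatures (assumption)
set.seed(9)
x <- 1:730
y <- -10 * cos(2 * pi * x / 365) + 6 + rnorm(length(x), sd = 4)

nknots <- 50
knots <- 730 * (1:nknots) / (nknots + 1)
B <- bs(x, knots = knots, degree = 3, intercept = TRUE)
B <- matrix(as.numeric(B), nrow = length(x))   # plain numeric matrix
nb <- ncol(B)                                  # number of B-spline coefficients

# Row j of Dmat %*% b is (b_{j+2} - b_{j+1}) - (b_{j+1} - b_j)
Dmat <- diff(diag(nb), differences = 2)

lambda <- 5
b_pen <- as.vector(solve(crossprod(B) + lambda * crossprod(Dmat), crossprod(B, y)))
b_ls  <- as.vector(solve(crossprod(B), crossprod(B, y)))

# Roughness (sum of squared second differences) of each coefficient vector
rough_pen <- sum((Dmat %*% b_pen)^2)
rough_ls  <- sum((Dmat %*% b_ls)^2)
```

Coefficients that lie on a straight line have zero second differences, so this penalty shrinks the fit toward a smooth trend rather than toward zero.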
Overfitting and Smoothness
A Penalized Cubic B-spline

A penalized cubic B-spline with 50 knots and $\lambda = 5$:
[Figure: scatterplot "Montreal Temp. -- January 1, 1961, to December 31, 1962" with the fitted penalized spline; x-axis: Day of Year (0 to 600); y-axis: Temperature (-20 to 20).]
Exercises

1. Montreal Temperature Data – Jan. 1 to Dec. 31, 1961
   File: Intro to splines\Exercises\montreal temp 7.R
   Fit penalized cubic B-splines to the Montreal temperature data for 1961 and 1962 using the provided code.