Dimension 1 - Vanderbilt University
TRANSCRIPT
Categorization

[Figure: stimuli in a two-dimensional space (Dimension 1 × Dimension 2), grouped into Category A and Category B]
P(R_j | i) = (β_j · E_j) / Σ_{K∈R} (β_K · E_K)

Now the evidence values E_j are evidence for category membership rather than evidence for identity.
P(A | i) = (β_A · E_A|i) / (β_A · E_A|i + β_B · E_B|i)
What are some ways categories could be represented?
What gives rise to the evidence values?
Prototypes

[Figure: the two categories in the Dimension 1 × Dimension 2 space, each summarized by a prototype]

E_A|i is proportional to similarity to the prototype of category A
Ideals

[Figure: the two categories in the Dimension 1 × Dimension 2 space, each with an ideal point]

E_A|i is proportional to similarity to the ideal point of category A
Exemplars

[Figure: the two categories in the Dimension 1 × Dimension 2 space, represented by their individual exemplars]

E_A|i is proportional to similarity to the experienced exemplars of category A
Decision Boundaries

[Figure: a boundary dividing the Dimension 1 × Dimension 2 space between the two categories]

E_A|i is given by which side of the boundary exemplar i is on (the boundary can be noisy)
Rules

[Figure: a rule boundary dividing the Dimension 1 × Dimension 2 space]

E_A|i is given by which side of the rule boundary exemplar i is on (the boundary can be noisy)
E_A|i is proportional to similarity to the experienced exemplars of category A:
- similarity to the closest exemplar (nearest neighbor)
- average similarity to exemplars
- summed similarity to exemplars
E_A|i = Σ_{j∈A} s_ij = Σ_{j=1..N_A} s_ij
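As a concrete illustration (in Python rather than the course's MATLAB, with made-up similarity values), the three pooling rules above differ only in how they combine the same s_ij:

```python
# Three ways to turn exemplar similarities into category evidence E_A|i.
# The similarity values s_ij below are hypothetical.
s_A = [0.8, 0.5, 0.1]  # similarities of item i to the exemplars of category A

E_nearest = max(s_A)             # similarity to closest exemplar (nearest neighbor)
E_average = sum(s_A) / len(s_A)  # average similarity to exemplars
E_summed  = sum(s_A)             # summed similarity to exemplars (the GCM choice)
```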
Generalized Context Model of Categorization
Exemplars

[Figure: item i plotted among the exemplars of category A in the Dimension 1 × Dimension 2 space]

E_A|i is proportional to similarity to the experienced exemplars of category A
Exemplars

[Figure: item i plotted among the exemplars of category B in the Dimension 1 × Dimension 2 space]

E_B|i is proportional to similarity to the experienced exemplars of category B
P(A | i) = (β_A · E_A|i) / (β_A · E_A|i + β_B · E_B|i)
E_A|i = Σ_{j=1..N_A} s_ij

P(A | i) = (β_A · E_A|i) / (β_A · E_A|i + β_B · E_B|i)
E_A|i = Σ_{j=1..N_A} s_ij

P(A | i) = (β_A · E_A|i) / (β_A · E_A|i + β_B · E_B|i)

P(A | i) = (β_A · Σ_{j=1..N_A} s_ij) / (β_A · Σ_{j=1..N_A} s_ij + β_B · Σ_{j=1..N_B} s_ij)
P(A | i) = (β_A · Σ_{j=1..N_A} s_ij) / (β_A · Σ_{j=1..N_A} s_ij + β_B · Σ_{j=1..N_B} s_ij)

P(R_j | i) = (β_j · s_ij) / Σ_{K∈R} (β_K · s_iK)
QUESTION: Can the same similarities that explain identification confusions also explain categorization confusions?
Shepard, Hovland, & Jenkins (1961) tested this prediction by first having people learn to identify each object by a unique name. They fitted the SCM to the observed identification data (more on this later) to obtain values of the bias and s_ij parameters. Next, they attempted to account for categorization data using those s_ij parameters in the categorization model.
Shepard, Hovland, & Jenkins (1961)

[Figure: the six category types (I-VI) defined over the dimensions size and shape, ordered from a single-dimension rule (Type I) through XOR (Type II) to unique identification (Type VI)]
Identification requires fine discriminations between similar stimuli … Categorization requires treating clearly discriminable stimuli as the same thing … So maybe it’s not surprising that the answer is no.
QUESTION: Can the same similarities that explain identification confusions also explain categorization confusions?
Not so fast …
Generalized Context Model (GCM)

s_ij = exp(−c · d_ij^p)

d_ij = ( Σ_{m=1..M} w_m · |i_m − j_m|^r )^(1/r)

P(A | i) = (β_A · Σ_{j=1..N_A} s_ij) / (β_A · Σ_{j=1..N_A} s_ij + β_B · Σ_{j=1..N_B} s_ij)
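The three GCM equations can be sketched in a few lines. This is a Python translation (the course code is MATLAB); the function name, default parameter values, and example coordinates are my own choices for illustration:

```python
import math

def gcm_prob_A(item, ex_A, ex_B, w, c=1.0, p=1.0, r=1.0, beta_A=0.5):
    """P(A | item) under the GCM: summed similarity to each category's
    exemplars, combined by the biased choice rule."""
    def dist(x, y):
        # weighted general distance metric: d_ij = (sum_m w_m |x_m - y_m|^r)^(1/r)
        return sum(wm * abs(xm - ym) ** r for wm, xm, ym in zip(w, x, y)) ** (1.0 / r)

    def sim(x, y):
        # s_ij = exp(-c * d_ij^p)
        return math.exp(-c * dist(x, y) ** p)

    E_A = sum(sim(item, ex) for ex in ex_A)  # summed similarity to A exemplars
    E_B = sum(sim(item, ex) for ex in ex_B)  # summed similarity to B exemplars
    beta_B = 1.0 - beta_A
    return (beta_A * E_A) / (beta_A * E_A + beta_B * E_B)

# an item close to the A exemplars should be classified as A with high probability
p_A = gcm_prob_A((1.0, 1.0),
                 ex_A=[(1.0, 1.2), (0.8, 1.0)],
                 ex_B=[(3.0, 3.0), (3.2, 2.8)],
                 w=(0.5, 0.5))
```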
[Figure: three points (i1,i2), (j1,j2), (k1,k2) in the Dimension 1 × Dimension 2 space]

d_ij = ( Σ_{m=1..M} w_m · |i_m − j_m|^r )^(1/r)

weighted general distance metric; w_m is the weight on dimension m
[Figure: the same three points; with w2 → 0, distances effectively collapse onto Dimension 1]
[Figure: the same three points; with w1 → 0, distances effectively collapse onto Dimension 2]
Parameter Fitting Techniques
how do we find the values of model parameters that maximize the fit of a model to observed data?
Measures of Fit
what do I mean by “fit”?
what are some ways you could measure fit?
Pearson Correlation
SSE
RMSE
% Variance Accounted For
Likelihood (next week)
Pearson Correlation

r_obs,prd = Σ (obs − μ_obs)(prd − μ_prd) / √( Σ (obs − μ_obs)² · Σ (prd − μ_prd)² )
Sum of Squared Error (SSE)

SSE_obs,prd = Σ (obs − prd)²
Root Mean Squared Error (RMSE)

RMSE_obs,prd = √( Σ (obs − prd)² / N )
% Variance Accounted For

%Var = (SSE_null − SSE_model) / SSE_null

SSE_null = Σ_i (obs_i − μ_obs)²

SSE_model = Σ_i (obs_i − prd_i)²
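All four fit measures are easy to compute directly from the formulas above; a small Python sketch (the observed and predicted values are made up):

```python
import math

def fit_measures(obs, prd):
    """SSE, RMSE, %Var accounted for, and Pearson r between observed and predicted."""
    n = len(obs)
    mu_obs = sum(obs) / n
    mu_prd = sum(prd) / n
    sse = sum((o - p) ** 2 for o, p in zip(obs, prd))
    rmse = math.sqrt(sse / n)
    sse_null = sum((o - mu_obs) ** 2 for o in obs)  # null model: predict the mean
    pct_var = (sse_null - sse) / sse_null
    cov = sum((o - mu_obs) * (p - mu_prd) for o, p in zip(obs, prd))
    r = cov / math.sqrt(sum((o - mu_obs) ** 2 for o in obs) *
                        sum((p - mu_prd) ** 2 for p in prd))
    return sse, rmse, pct_var, r

sse, rmse, pct_var, r = fit_measures([0.77, 0.39, 0.21], [0.79, 0.45, 0.23])
```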
Parameter Fitting Techniques
Minimize SSE Maximize r
Maximize %Var
next week we'll talk about maximum likelihood; after that we'll talk about more complex measures
One approach: CALCULUS
DUMB MODEL (example)
d_ij   obs s_ij   prd s_ij
0      1.000
1      0.368
2      0.135
3      0.050
4      0.018
5      0.007

s_ij = α + β · d_ij
find parameters (α and β) that minimize SSE between obs sij and prd sij
SSE = Σ_k (obs_k − prd_k)²

SSE = Σ_k (obs_k − (α + β · d_k))²

∂SSE/∂α = Σ_k 2 · (obs_k − α − β · d_k) · (−1) = −2 · Σ_k (obs_k − α − β · d_k)

∂SSE/∂β = Σ_k 2 · (obs_k − α − β · d_k) · (−d_k) = −2 · Σ_k (obs_k − α − β · d_k) · d_k

set ∂SSE/∂α = 0 and ∂SSE/∂β = 0, then solve for α and β
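For this linear model the two normal equations have a closed-form solution; a Python sketch using the d_ij and observed s_ij values from the example table:

```python
# Solve the two normal equations dSSE/dalpha = 0 and dSSE/dbeta = 0
# for the linear model s_ij = alpha + beta * d_ij (the "dumb model" above).
d   = [0, 1, 2, 3, 4, 5]                          # d_ij from the example table
obs = [1.000, 0.368, 0.135, 0.050, 0.018, 0.007]  # observed s_ij

n = len(d)
sum_d, sum_o = sum(d), sum(obs)
sum_dd = sum(x * x for x in d)
sum_do = sum(x * o for x, o in zip(d, obs))

# normal equations:  n*alpha      + sum_d*beta  = sum_o
#                    sum_d*alpha  + sum_dd*beta = sum_do
beta = (n * sum_do - sum_d * sum_o) / (n * sum_dd - sum_d ** 2)
alpha = (sum_o - beta * sum_d) / n

sse = sum((o - (alpha + beta * x)) ** 2 for x, o in zip(d, obs))
```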
Why does this work?
One approach: CALCULUS
nearly impossible in many situations: an intractable mathematical problem
Other approaches: Search/Optimization Algorithms
require computer power
but first, a quick aside …
Illustration of one Common Modeling Technique
(1) start with a model
(2) set the free parameters to known values
(3) generate predictions from the model
(4) now treat those predictions as "data"
(5) fit the model to the "observed data"
(6) can you fit the model to the data (you should)?
(7) do you get the same parameters back (depends)?
Why would you do this?
(a) test that your model fitting program works right
(b) check that the parameters are "identifiable" (more later)
(c) compare models based on their "flexibility"
Imagine Model A can fit data generated by Model A and by Model B, but Model B can only really fit data generated by Model B; then perhaps Model A is too flexible.
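Steps (1)-(7) can be sketched in a few lines of Python, using a hypothetical one-parameter model s(d) = exp(−c·d) and a crude grid search as the fitting routine:

```python
import math

# (1) a hypothetical one-parameter model: s(d) = exp(-c * d)
def model(c, ds):
    return [math.exp(-c * d) for d in ds]

ds = [0, 1, 2, 3, 4, 5]
true_c = 1.5                    # (2) set the free parameter to a known value
fake_data = model(true_c, ds)   # (3)-(4) predictions treated as "observed data"

# (5) fit the model to the "data" with a crude 1-D grid search over c
best_c, best_sse = None, float("inf")
for step in range(1, 501):
    c = step * 0.01             # candidate c values in (0, 5]
    sse = sum((o - p) ** 2 for o, p in zip(fake_data, model(c, ds)))
    if sse < best_sse:
        best_c, best_sse = c, sse
# (6)-(7) the fit should be near-perfect, and best_c should recover true_c
```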
Generalized Context Model (GCM)

s_ij = exp(−c · d_ij^p)

d_ij = ( Σ_{m=1..M} w_m · |i_m − j_m|^r )^(1/r)

P(A | i) = (β_A · Σ_{j=1..N_A} s_ij) / (β_A · Σ_{j=1..N_A} s_ij + β_B · Σ_{j=1..N_B} s_ij)
Categorization Task
unidimensional stimuli
e.g., proportion of white vs. black squares
MATLAB EXAMPLE
Categorization Task
two-dimensional stimuli
MATLAB EXAMPLE
how do we find the values of the model parameters that minimize SSE (or maximize r, or maximize %Var)?
GRID SEARCH
[Figure: a grid over parameter 1 × parameter 2]

calculate SSE at each combination of parameter 1 and parameter 2
Matlab: See grid search for simple 1-parameter categorization model
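The class example is in MATLAB; an analogous two-parameter grid search can be sketched in Python (the model form, data, and grid ranges here are made up for illustration):

```python
import math

ds  = [0, 1, 2, 3]
obs = [0.9, 0.4, 0.2, 0.1]  # hypothetical observed values

def sse(alpha, beta):
    # hypothetical two-parameter model: s = alpha * exp(-beta * d)
    prd = [alpha * math.exp(-beta * d) for d in ds]
    return sum((o - p) ** 2 for o, p in zip(obs, prd))

# evaluate SSE at every combination of the two parameters
best_params, best_sse = None, float("inf")
for a_step in range(1, 41):          # alpha in 0.05 .. 2.00
    for b_step in range(1, 41):      # beta  in 0.05 .. 2.00
        alpha, beta = a_step * 0.05, b_step * 0.05
        fit = sse(alpha, beta)
        if fit < best_sse:
            best_params, best_sse = (alpha, beta), fit
# best_params now holds the grid point with the lowest SSE
```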
What might be some limitations of a grid search?
the finer the grid search, the more evaluations you need to run
How fine of a grid search do you run? What if the best-fitting parameters are between the ones you’ve tried?
How long does it take to run a grid search?

(evaluation time for one set of parameters) × (# of evaluations)

# of evaluations = (# steps of parm 1) × (# steps of parm 2) × (# steps of parm 3) × …

e.g., 1000 × 1000 × 1000 × 1000 = 10¹² evaluations
1 nanosecond (10⁻⁹ s) per evaluation × 10¹² evaluations = 10³ seconds ≈ 17 minutes
100 seconds (10² s) per evaluation × 10¹² evaluations = 10¹⁴ seconds ≈ 3 million years
Hill-climbing Algorithms
simple hill-climbing
Nelder-Mead simplex
Hooke and Jeeves
“direct search methods”
Enrico Fermi and Nicholas Metropolis used one of the first digital computers, the Los Alamos Maniac, to determine which values of certain theoretical parameters (phase shifts) best fit experimental data (scattering cross sections). They varied one theoretical parameter at a time by steps of the same magnitude, and when no such increase or decrease in any one parameter further improved the fit to the experimental data, they halved the step size and repeated the process until the steps were deemed sufficiently small. Their simple procedure was slow but sure, and several of us used it on the Avidac computer at the Argonne National Laboratory for adjusting six theoretical parameters to fit the pion-proton scattering data we had gathered using the University of Chicago synchrocyclotron [7].
W. C. Davidon, Variable Metric Method for Minimization, Tech. Rep. 5990, Argonne National Laboratory, Argonne, IL, 1959.
these techniques only emerged 50 years ago (Calculus was invented 400 years ago)
Simple Hill Climbing
DEMONSTRATE
Simple Hill Climbing
how many points do you need to evaluate with each step?
2 parameters:
[Figure: the 8 neighboring points around the current point, numbered 1-8]
Simple Hill Climbing
how many points do you need to evaluate with each step?
N parameters: 3^N − 1 evaluations per step
5 parameters: 3^5 − 1 = 242 evaluations per step
10 parameters: 3^10 − 1 = 59,048 evaluations per step

this ends up being inefficient because you may need to take thousands of steps
“stupid” algorithm
SimpleHillClimb.m
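SimpleHillClimb.m itself is not reproduced in this transcript; the following Python sketch shows the same idea on a hypothetical smooth objective (the function and step size are made up):

```python
def f(x, y):
    # hypothetical objective to minimize: a smooth bowl with minimum at (2, -1)
    return (x - 2) ** 2 + (y + 1) ** 2

def hill_climb(x, y, step=0.1):
    """Simple hill climbing: evaluate all 3**2 - 1 = 8 neighbors one step
    away, move to the best one, and stop when no neighbor improves."""
    while True:
        neighbors = [(x + dx * step, y + dy * step)
                     for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                     if (dx, dy) != (0, 0)]
        best = min(neighbors, key=lambda pt: f(*pt))
        if f(*best) >= f(x, y):
            return x, y          # no neighbor improves: stop
        x, y = best

x_min, y_min = hill_climb(0.0, 0.0)
```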
More sophisticated algorithms
Kolda, T.G., Lewis, R.M., & Torczon, V. (2003) Optimization by direct search: New perspectives on some classical and modern methods. SIAM Review, 45, 385-482.
More sophisticated algorithms
DEMONSTRATE
e.g., Hooke and Jeeves, a pattern search method
More sophisticated algorithms
DEMONSTRATE
e.g., Nelder-Mead simplex (fminsearch in MATLAB)
http://en.wikipedia.org/wiki/Nelder-Mead_method
http://www.scholarpedia.org/article/Nelder-Mead_algorithm
What is a simplex?
0 dimensions: point (1 vertex)
1 dimension: line (2 vertices)
2 dimensions: triangle (3 vertices)
3 dimensions: tetrahedron (4 vertices)
4 dimensions: pentachoron (5 vertices)
…
N dimensions: N-simplex (N+1 vertices)
basically just a generalization of a triangle to N dimensions
reflect
expand
contract
shrink
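The four moves on a 2-D simplex (a triangle) can be written out directly. This is a hand-rolled Python sketch of a single iteration with the standard coefficients (reflection 1, expansion 2, contraction 0.5, shrink 0.5), not MATLAB's fminsearch itself; the objective and starting simplex are made up:

```python
def f(p):
    x, y = p
    return x * x + y * y          # hypothetical objective to minimize

simplex = [(1.0, 1.0), (1.5, 1.0), (1.0, 1.5)]
simplex.sort(key=f)               # order vertices best-first
best, second, worst = simplex
centroid = tuple((b + s) / 2 for b, s in zip(best, second))

# reflect: mirror the worst vertex through the centroid of the others
reflect = tuple(c + (c - w) for c, w in zip(centroid, worst))
# expand: push further in the same direction (tried when reflection is the new best)
expand = tuple(c + 2 * (c - w) for c, w in zip(centroid, worst))
# contract: pull the worst vertex back toward the centroid
contract = tuple(c + 0.5 * (w - c) for c, w in zip(centroid, worst))
# shrink: move every non-best vertex halfway toward the best vertex
shrunk = [best] + [tuple((v + b) / 2 for v, b in zip(vert, best))
                   for vert in (second, worst)]
```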
Matlab examples
More Matlab examples
Medin & Schaffer (1978)
     d1 d2 d3 d4   P(A)obs  P(A)prd
A1   1  1  1  2    .77      .79
A2   1  2  1  2    .78      .83
A3   1  2  1  1    .83      .88
A4   1  1  2  1    .64      .65
A5   2  1  1  1    .61      .64
B1   1  1  2  2    .39      .45
B2   2  1  1  2    .41      .44
B3   2  2  2  1    .21      .23
B4   2  2  2  2    .15      .16
T1   1  2  2  1    .56      .62
T2   1  2  2  2    .41      .47
T3   1  1  1  1    .82      .85
T4   2  2  1  2    .40      .45
T5   2  1  2  1    .32      .34
T6   2  2  1  1    .53      .61
T7   2  1  2  2    .20      .22
[Figure: the fitting pipeline: the GCM equations, with free parameters w1, w2, w3, w4, and c, generate the P(A)prd column, which is compared against the P(A)obs column via SSE]
Gradient-Based Techniques when you can calculate
(or approximate) derivatives
Simulated Annealing (generalization of Metropolis algorithm)
with noisy objective functions and with discrete parameter values
Genetic Search Algorithms with discrete parameter values
possible project:
explore different parameter search routines to see which
best recovers parameters and does it most quickly
Homework Assignment
fit SCM fit GCM
partly using code we used in class today and code from last week’s assignment
I encourage people to work together conceptually, but each person should do their own programming.
![Page 95: Dimension 1 - Vanderbilt University](https://reader033.vdocuments.site/reader033/viewer/2022052313/6289dbf70cca941810009c63/html5/thumbnails/95.jpg)
![Page 96: Dimension 1 - Vanderbilt University](https://reader033.vdocuments.site/reader033/viewer/2022052313/6289dbf70cca941810009c63/html5/thumbnails/96.jpg)
Problems of local minima
importance of multiple starting positions
![Page 97: Dimension 1 - Vanderbilt University](https://reader033.vdocuments.site/reader033/viewer/2022052313/6289dbf70cca941810009c63/html5/thumbnails/97.jpg)
![Page 98: Dimension 1 - Vanderbilt University](https://reader033.vdocuments.site/reader033/viewer/2022052313/6289dbf70cca941810009c63/html5/thumbnails/98.jpg)
Genetic Algorithms and Simulated Annealing may solve these problems
![Page 99: Dimension 1 - Vanderbilt University](https://reader033.vdocuments.site/reader033/viewer/2022052313/6289dbf70cca941810009c63/html5/thumbnails/99.jpg)
Simulated Annealing

always accept the new candidate parameter vector if it gives a better fit

but also accept a new candidate parameter vector with probability P if it gives a WORSE fit

e.g., P = exp(−Δfit / T)
Δfit is the decrease in fit between current and candidate
T is the “temperature”, which decreases according to a schedule

as Δfit → 0, P → 1
T starts at ∞, so P starts at 1 (completely random)
T goes to 0, so P goes to 0 (pure hill climbing)

depending on the cooling schedule, simulated annealing can take orders of magnitude longer than a basic hill-climbing algorithm
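The acceptance rule is the heart of the algorithm; a Python sketch (the Δfit values, temperatures, and trial counts are made up for illustration):

```python
import math
import random

def accept(delta_fit, T, rng):
    """Simulated-annealing acceptance rule: always accept an improvement
    (delta_fit <= 0); accept a worse candidate with P = exp(-delta_fit / T)."""
    if delta_fit <= 0:
        return True
    return rng.random() < math.exp(-delta_fit / T)

rng = random.Random(0)
# at a high temperature, a worse candidate is accepted almost every time ...
hot = sum(accept(1.0, T=100.0, rng=rng) for _ in range(1000))
# ... at a low temperature, the same worse candidate is almost never accepted
cold = sum(accept(1.0, T=0.01, rng=rng) for _ in range(1000))
```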
![Page 100: Dimension 1 - Vanderbilt University](https://reader033.vdocuments.site/reader033/viewer/2022052313/6289dbf70cca941810009c63/html5/thumbnails/100.jpg)
Genetic Algorithms multiple candidate parameter vectors are recombined or mutated and only some offspring are retained, akin to natural selection
![Page 101: Dimension 1 - Vanderbilt University](https://reader033.vdocuments.site/reader033/viewer/2022052313/6289dbf70cca941810009c63/html5/thumbnails/101.jpg)
Problems of local minima
importance of multiple starting positions
how do you know when you've tested enough starting points?
![Page 102: Dimension 1 - Vanderbilt University](https://reader033.vdocuments.site/reader033/viewer/2022052313/6289dbf70cca941810009c63/html5/thumbnails/102.jpg)
what starting points do you pick?
(1) based on "experience" with the model
(2) based on an initial coarse parameter search followed by a fine parameter search
(3) an initial "random search" first, like simulated annealing or genetic algorithms
![Page 103: Dimension 1 - Vanderbilt University](https://reader033.vdocuments.site/reader033/viewer/2022052313/6289dbf70cca941810009c63/html5/thumbnails/103.jpg)
how many starting points?
(1) do many starting points converge on the same optimal parameter values?
(2) need to consider the amount of time it takes to do a search from each starting point
(3) if the model fits "everything" you're okay, but it's harder to know that a model really blows it
![Page 104: Dimension 1 - Vanderbilt University](https://reader033.vdocuments.site/reader033/viewer/2022052313/6289dbf70cca941810009c63/html5/thumbnails/104.jpg)
![Page 105: Dimension 1 - Vanderbilt University](https://reader033.vdocuments.site/reader033/viewer/2022052313/6289dbf70cca941810009c63/html5/thumbnails/105.jpg)
How to use the programs
![Page 106: Dimension 1 - Vanderbilt University](https://reader033.vdocuments.site/reader033/viewer/2022052313/6289dbf70cca941810009c63/html5/thumbnails/106.jpg)
```matlab
parInit = [3 2];
options = optimset('display', 'iter', 'MaxIter', 500);
[bestx,fval] = fminsearch(@mymodel,parInit,options);
```
passing a function handle (@mymodel) as a parameter
![Page 107: Dimension 1 - Vanderbilt University](https://reader033.vdocuments.site/reader033/viewer/2022052313/6289dbf70cca941810009c63/html5/thumbnails/107.jpg)
```matlab
parInit = [1.6 -1.6];
parInc  = [0.1 0.1];
parLow  = [-4 -4];
parHigh = [ 4  4];
[HOOK_fit,HOOK_pos,HOOK_path] = ...
    hook('mymodel',parInit,parLow,parHigh,parInc,parInc/10);
```
passing the name of a function as a string; hook uses MATLAB’s eval() function
![Page 108: Dimension 1 - Vanderbilt University](https://reader033.vdocuments.site/reader033/viewer/2022052313/6289dbf70cca941810009c63/html5/thumbnails/108.jpg)
```matlab
parInit = [3 2];
options = optimset('display', 'iter', 'MaxIter', 500);
[bestx,fval] = fminsearch(@mymodel,parInit,options);
```
fminsearch()
| Stimulus | Dim 1 | Dim 2 | Dim 3 | Dim 4 | P(A)obs | P(A)prd |
|----------|-------|-------|-------|-------|---------|---------|
| A1 | 1 | 1 | 1 | 2 | .77 | .79 |
| A2 | 1 | 2 | 1 | 2 | .78 | .83 |
| A3 | 1 | 2 | 1 | 1 | .83 | .88 |
| A4 | 1 | 1 | 2 | 1 | .64 | .65 |
| A5 | 2 | 1 | 1 | 1 | .61 | .64 |
| B1 | 1 | 1 | 2 | 2 | .39 | .45 |
| B2 | 2 | 1 | 1 | 2 | .41 | .44 |
| B3 | 2 | 2 | 2 | 1 | .21 | .23 |
| B4 | 2 | 2 | 2 | 2 | .15 | .16 |
| T1 | 1 | 2 | 2 | 1 | .56 | .62 |
| T2 | 1 | 2 | 2 | 2 | .41 | .47 |
| T3 | 1 | 1 | 1 | 1 | .82 | .85 |
| T4 | 2 | 2 | 1 | 2 | .40 | .45 |
| T5 | 2 | 1 | 2 | 1 | .32 | .34 |
| T6 | 2 | 2 | 1 | 1 | .53 | .61 |
| T7 | 2 | 1 | 2 | 2 | .20 | .22 |
$$s_{ij} = e^{-c \cdot d_{ij}^{\,p}} = \exp\!\left(-c \cdot d_{ij}^{\,p}\right)$$

$$d_{ij} = \left( \sum_{m=1}^{M} w_m \, |i_m - j_m|^r \right)^{1/r}$$

$$P(A \mid i) = \frac{\beta_A \cdot \sum_{j=1}^{N_A} s_{ij}}{\beta_A \cdot \sum_{j=1}^{N_A} s_{ij} + \beta_B \cdot \sum_{j=1}^{N_B} s_{ij}}$$
(diagram: fminsearch() passes params to mymodel(), which returns a fit value; fminsearch() then changes the params to try to decrease the fit)
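The fminsearch ↔ mymodel() loop on this slide has a direct Python analogue: SciPy's Nelder-Mead method is the same simplex algorithm that fminsearch uses. The toy data and two-parameter model below are illustrative assumptions, not the slide's category-learning data.

```python
import numpy as np
from scipy.optimize import minimize

# toy observed data (illustrative, not the slide's experiment)
obs = np.array([0.77, 0.39, 0.61, 0.21])
x   = np.array([1.0, 2.0, 3.0, 4.0])

def mymodel(params):
    """Return a badness-of-fit (SSE) for a candidate parameter vector."""
    a, b = params
    prd = 1.0 / (1.0 + np.exp(-(a - b * x)))   # toy 2-parameter model
    return np.sum((obs - prd) ** 2)            # the search tries to decrease this

par_init = [3.0, 2.0]
res = minimize(mymodel, par_init, method='Nelder-Mead',
               options={'maxiter': 500})
print(res.x, res.fun)
```

Exactly as in the diagram, the optimizer repeatedly calls mymodel() with new parameter vectors and keeps whatever decreases the returned fit value.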
![Page 109: Dimension 1 - Vanderbilt University](https://reader033.vdocuments.site/reader033/viewer/2022052313/6289dbf70cca941810009c63/html5/thumbnails/109.jpg)
Some things to consider when using these tools:
• do you have continuous vs. discrete parameters?
  - discrete parameters may require grid search
• need for multiple starting points because of local minima
  - did you use enough starting points?
  - do various starting points converge?
  - how long does each parameter search take?
• where to place the starting points
  - based on experience with the model
  - preliminary exploration of the parameter space
• has the maximum number of iterations been reached?
  - MaxIter and MaxFunEvals in MATLAB
  - should only be reached if a parameter is going to ∞
![Page 110: Dimension 1 - Vanderbilt University](https://reader033.vdocuments.site/reader033/viewer/2022052313/6289dbf70cca941810009c63/html5/thumbnails/110.jpg)
Some things to consider when using these tools:
• what is the initial step size in the search?
  - consider a large step size in stage 1, a smaller step size in stage 2
  - does the algorithm decrease the step size?
  - what is the step size for each parameter?
• what is the range of valid values for each parameter?
  - does the search algorithm let you set min and max values?
![Page 111: Dimension 1 - Vanderbilt University](https://reader033.vdocuments.site/reader033/viewer/2022052313/6289dbf70cca941810009c63/html5/thumbnails/111.jpg)
```matlab
parInit = [1.6 -1.6];
parInc  = [0.1 0.1];
parLow  = [-4 -4];
parHigh = [ 4  4];
[HOOK_fit,HOOK_pos,HOOK_path] = ...
    hook('mymodel',parInit,parLow,parHigh,parInc,parInc/10);
```
Hooke and Jeeves lets you specify the step size (parInc) separately for each parameter
![Page 112: Dimension 1 - Vanderbilt University](https://reader033.vdocuments.site/reader033/viewer/2022052313/6289dbf70cca941810009c63/html5/thumbnails/112.jpg)
```matlab
parInit = [1.6 -1.6];
parInc  = [0.1 0.1];
parLow  = [-4 -4];
parHigh = [ 4  4];
[HOOK_fit,HOOK_pos,HOOK_path] = ...
    hook('mymodel',parInit,parLow,parHigh,parInc,parInc/10);
```
Hooke and Jeeves lets you specify the step size (parInc) separately for each parameter, and lets you specify the min (parLow) and max (parHigh) separately for each parameter
![Page 113: Dimension 1 - Vanderbilt University](https://reader033.vdocuments.site/reader033/viewer/2022052313/6289dbf70cca941810009c63/html5/thumbnails/113.jpg)
```matlab
parInit = [3 2];
options = optimset('display', 'iter', 'MaxIter', 500);
[bestx,fval] = fminsearch(@mymodel,parInit,options);
```
MATLAB’s fminsearch (Simplex) does not let you set min and max values: all parameters are allowed to range between −∞ and +∞.

ONE SOLUTION: use fminsearchbnd from the MATLAB Central File Exchange
http://www.mathworks.com/matlabcentral/fileexchange/loadFile.do?objectId=8277&objectType=file
![Page 114: Dimension 1 - Vanderbilt University](https://reader033.vdocuments.site/reader033/viewer/2022052313/6289dbf70cca941810009c63/html5/thumbnails/114.jpg)
```matlab
parInit = [3 2];
options = optimset('display', 'iter', 'MaxIter', 500);
[bestx,fval] = fminsearch(@mymodel,parInit,options);
```
or program the constraints yourself:

```matlab
function fit = mymodel(params)
p1 = params(1);
p2 = params(2);
p3 = params(3);
% ...
fit = sse;
```
![Page 115: Dimension 1 - Vanderbilt University](https://reader033.vdocuments.site/reader033/viewer/2022052313/6289dbf70cca941810009c63/html5/thumbnails/115.jpg)
```matlab
parInit = [3 2];
options = optimset('display', 'iter', 'MaxIter', 500);
[bestx,fval] = fminsearch(@mymodel,parInit,options);
```
or program the constraints yourself:

```matlab
function fit = mymodel(params)
p1 = params(1);   % what if this can only go between -∞ and +∞?
p2 = params(2);   % and this can go only between 1 and +∞?
p3 = params(3);   % and this can only go between 1 and 3?
% ...
fit = sse;
```
![Page 116: Dimension 1 - Vanderbilt University](https://reader033.vdocuments.site/reader033/viewer/2022052313/6289dbf70cca941810009c63/html5/thumbnails/116.jpg)
```matlab
parInit = [3 2];
options = optimset('display', 'iter', 'MaxIter', 500);
[bestx,fval] = fminsearch(@mymodel,parInit,options);
```
or program the constraints yourself:

```matlab
function fit = mymodel(params)
p1 = params(1);                    % between -∞ and +∞
p2 = 1 + params(2)^2;              % between 1 and +∞
p3 = 1 + 3*(sin(params(3))+1)/2;   % between 1 and 4
% ...
fit = sse;
```
![Page 117: Dimension 1 - Vanderbilt University](https://reader033.vdocuments.site/reader033/viewer/2022052313/6289dbf70cca941810009c63/html5/thumbnails/117.jpg)
```matlab
parInit = [3 2];
options = optimset('display', 'iter', 'MaxIter', 500);
[bestx,fval] = fminsearch(@mymodel,parInit,options);
```
or program the constraints yourself:

```matlab
function fit = mymodel(params)
p1 = params(1);                              % between -∞ and +∞
p2 = LOW + params(2)^2;                      % between LOW and +∞
p3 = HIGH - params(3)^2;                     % between -∞ and HIGH
p4 = LOW + (HIGH-LOW)*(sin(params(4))+1)/2;  % between LOW and HIGH
% ...
fit = sse;
```
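The same bounding-by-transformation trick, sketched in Python. `transform` is a hypothetical helper name; LOW and HIGH follow the slide. The optimizer only ever sees the unconstrained raw values; the model maps them into range before using them.

```python
import math

LOW, HIGH = 1.0, 3.0   # illustrative bounds

def transform(raw):
    """Map unconstrained raw search values into bounded model parameters."""
    p1 = raw[0]                                            # between -inf and +inf
    p2 = LOW + raw[1] ** 2                                 # between LOW and +inf
    p3 = HIGH - raw[2] ** 2                                # between -inf and HIGH
    p4 = LOW + (HIGH - LOW) * (math.sin(raw[3]) + 1) / 2   # between LOW and HIGH
    return p1, p2, p3, p4
```

One design caveat worth knowing: these maps are not one-to-one (e.g., the sine wraps around), so several raw vectors can produce the same model parameters; that is harmless for finding a best fit but can complicate interpreting the raw search output.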
![Page 118: Dimension 1 - Vanderbilt University](https://reader033.vdocuments.site/reader033/viewer/2022052313/6289dbf70cca941810009c63/html5/thumbnails/118.jpg)
```matlab
parInit = [3 2];
options = optimset('display', 'iter', 'MaxIter', 500);
[bestx,fval] = fminsearch(@mymodel,parInit,options);
```
MATLAB’s fminsearch (Simplex) does not let you set the step size (for any parameter); it may use an initial step size that is proportional to the value of the parameter, but this cannot be set by the user. It’s probably okay.

There are some other search algorithms that assume the same step size for every parameter (e.g., subplex).
![Page 119: Dimension 1 - Vanderbilt University](https://reader033.vdocuments.site/reader033/viewer/2022052313/6289dbf70cca941810009c63/html5/thumbnails/119.jpg)
![Page 120: Dimension 1 - Vanderbilt University](https://reader033.vdocuments.site/reader033/viewer/2022052313/6289dbf70cca941810009c63/html5/thumbnails/120.jpg)
Some options the programs give you …
Max Iterations
• check that Max Iterations is never hit
• set it to a big number
• sometimes searches can go off to infinity
![Page 121: Dimension 1 - Vanderbilt University](https://reader033.vdocuments.site/reader033/viewer/2022052313/6289dbf70cca941810009c63/html5/thumbnails/121.jpg)
Some options the programs give you …
Step Size / Min Step Size
• rule of thumb: 1/100 of the expected parameter value
• NOTE: some programs use the same step size for every parameter; you should rescale the parameter values within the model routine
• one approach is to do an initial search with a large step size just to find a reasonable set of starting points, then switch to a smaller step size
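The coarse-then-fine approach can be sketched for one parameter like this. The function name and the refinement window (one coarse step on either side of the coarse winner) are illustrative assumptions.

```python
def coarse_to_fine(objective, lo, hi, coarse_step, fine_step):
    """Grid-scan coarsely, then refine around the coarse winner (sketch)."""
    # stage 1: coarse scan over the whole range
    grid = [lo + i * coarse_step
            for i in range(int((hi - lo) / coarse_step) + 1)]
    x0 = min(grid, key=objective)
    # stage 2: fine scan in a window around the coarse winner
    fine = [x0 - coarse_step + i * fine_step
            for i in range(int(2 * coarse_step / fine_step) + 1)]
    return min(fine, key=objective)
```

The coarse stage plays the role of finding a reasonable starting point; the fine stage then resolves the optimum at the smaller step size.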
![Page 122: Dimension 1 - Vanderbilt University](https://reader033.vdocuments.site/reader033/viewer/2022052313/6289dbf70cca941810009c63/html5/thumbnails/122.jpg)
Some options the programs give you …
Step Size / Min Step Size
what if the step size is way too big?
![Page 123: Dimension 1 - Vanderbilt University](https://reader033.vdocuments.site/reader033/viewer/2022052313/6289dbf70cca941810009c63/html5/thumbnails/123.jpg)
Some options the programs give you …
Step Size / Min Step Size
what if the step size is way too big?

(diagram: with too large a step size, the search jumps from the starting point right over the minimum)
![Page 124: Dimension 1 - Vanderbilt University](https://reader033.vdocuments.site/reader033/viewer/2022052313/6289dbf70cca941810009c63/html5/thumbnails/124.jpg)
Some options the programs give you …
Step Size / Min Step Size what if step size is way too big?
noisy objective functions – e.g., when the fit is estimated with Monte Carlo simulations
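Why noise matters here: a Monte Carlo fit value is itself a random variable, so a parameter step whose effect on the fit is smaller than the simulation noise is invisible to the search. A minimal sketch (the quadratic "true" fit surface and the noise level are assumptions):

```python
import random
import statistics

def noisy_objective(param, n_sims=100):
    """Monte Carlo estimate of a fit value whose true surface is (param - 1)^2."""
    sims = [(param - 1) ** 2 + random.gauss(0, 0.5) for _ in range(n_sims)]
    return statistics.mean(sims)   # more simulations -> less noise in the fit
```

Repeated calls with the same parameter return different fit values; averaging more simulated trials shrinks that noise (by roughly the square root of the number of trials), which is what makes small step sizes usable again.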
![Page 125: Dimension 1 - Vanderbilt University](https://reader033.vdocuments.site/reader033/viewer/2022052313/6289dbf70cca941810009c63/html5/thumbnails/125.jpg)
![Page 126: Dimension 1 - Vanderbilt University](https://reader033.vdocuments.site/reader033/viewer/2022052313/6289dbf70cca941810009c63/html5/thumbnails/126.jpg)
More sophisticated algorithms
combining hill climbing with grid search, when some parameters are continuous and some are discrete
![Page 127: Dimension 1 - Vanderbilt University](https://reader033.vdocuments.site/reader033/viewer/2022052313/6289dbf70cca941810009c63/html5/thumbnails/127.jpg)