Cupcakes versus muffins: Support vector machines

Arthur J. Parzygnat
University of Connecticut SIGMA
January 26, 2018

Arthur J. Parzygnat (University of Connecticut SIGMA), Cupcakes versus muffins, January 26, 2018
Outline

1. Cupcakes and muffins: What's the difference?
2. Hyperplanes: Parametrizing hyperplanes; Marginal hyperplanes; The definition of a support vector machine
3. Existence and uniqueness of SVMs: Enlarging margins; Existence of SVMs
4. A simple example of an SVM
What's the difference?
Parametrizing hyperplanes

Lemma 1

Let H ⊆ Rⁿ be a hyperplane. Then there exists a vector ~w ∈ Rⁿ \ {~0} and a real number c ∈ R such that H is the solution set of ⟨~w, ~x⟩ − c = 0.

Proof.

Let ~h ∈ H. Then H − ~h is an (n − 1)-dimensional subspace of Rⁿ. Hence, (H − ~h)⊥ is a one-dimensional subspace spanned by some unit vector û. Because span{û} is perpendicular to H, there exists an a ∈ R such that aû ∈ H. Then H is the solution set of ⟨û, ~x⟩ − a = 0. ∎

[Figure: the hyperplane H with a point ~h ∈ H, the unit normal û, and the point aû ∈ H.]
Marginal planes

Note that a hat over a vector signifies that it is a unit vector (has magnitude 1). There are many ways to describe hyperplanes, and it may not be obvious why the description by a vector and a number is preferable. Furthermore, notice that there are redundancies in this parametrization: for example, (λ~w, λc) and (~w, c) describe the same hyperplane for all λ ∈ R \ {0}.

Definition 2

Let ~w ∈ Rⁿ \ {~0} and c ∈ R with associated plane H given by the solution set of ⟨~w, ~x⟩ − c = 0. The marginal planes H+ and H− associated to (~w, c) are the solution sets of ⟨~w, ~x⟩ − c = 1 and ⟨~w, ~x⟩ − c = −1, respectively, i.e.

    H± := { ~x ∈ Rⁿ : ⟨~w, ~x⟩ − c = ±1 }.
Marginal planes example

Example 3

In R², if ~w = (1/3)[3, 1]ᵀ and c = 4, then the marginal planes would look like the figure on the right, since they are described by

    H : (1/3)(3x + y) − 4 = 0
    H+ : (1/3)(3x + y) − 4 = 1
    H− : (1/3)(3x + y) − 4 = −1

[Figure: the parallel lines H−, H, H+ with the normal vector ~w.]

Notice that the vector (c/‖~w‖²) ~w = (6/5)[3, 1]ᵀ lies on the plane H.
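As a quick numerical check of Example 3 (a NumPy sketch added here, not part of the original slides), the point (c/‖~w‖²)~w does satisfy ⟨~w, ~x⟩ − c = 0:

```python
import numpy as np

# Example 3 data: w = (1/3) * [3, 1] and c = 4.
w = np.array([3.0, 1.0]) / 3.0
c = 4.0

# The point (c / <w, w>) * w should lie on the plane H.
x = (c / np.dot(w, w)) * w
print(x)                 # [3.6 1.2], i.e. (6/5) * [3, 1]
print(np.dot(w, x) - c)  # 0.0 (up to floating-point error)
```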
The distance between marginal hyperplanes

Lemma 4

Let (~w, c) ∈ (Rⁿ \ {~0}) × R describe a hyperplane H. The perpendicular distance between H and H± is 1/‖~w‖.

Proof.

Let ~x+ ∈ H+ and ~x ∈ H. The orthogonal distance between H and H+ is given by

    ⟨~x+ − ~x, ~w/‖~w‖⟩ = (1/‖~w‖)(⟨~w, ~x+⟩ − ⟨~w, ~x⟩) = (1/‖~w‖)((1 + c) − c) = 1/‖~w‖

by the definition of H and H+ in terms of (~w, c). A similar calculation holds for H−. ∎

[Figure: H−, H, H+ with ~x ∈ H, ~x+ ∈ H+, the difference ~x+ − ~x, and the unit normal ~w/‖~w‖.]
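Lemma 4 is easy to sanity-check numerically. The snippet below (an illustration added here, reusing the ~w and c from Example 3) builds explicit points on H and H+ and measures their separation along the unit normal:

```python
import numpy as np

# Using the hyperplane from Example 3: w = (1/3) * [3, 1], c = 4.
w = np.array([3.0, 1.0]) / 3.0
c = 4.0
norm_w = np.linalg.norm(w)

x = (c / np.dot(w, w)) * w             # a point on H
x_plus = ((c + 1) / np.dot(w, w)) * w  # a point on H+, since <w, x_plus> - c = 1
dist = np.dot(x_plus - x, w / norm_w)  # orthogonal distance from H to H+
print(dist, 1.0 / norm_w)              # both equal 1/||w||
```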
The margin and margin width

Definition 5

Let (~w, c) ∈ (Rⁿ \ {~0}) × R describe a hyperplane H. The convex region between H+ and H− is called the margin of (~w, c). The orthogonal distance between H+ and H−, which is given by 2/‖~w‖, is called the margin width of (~w, c).

Even though (~w, c) can be scaled to (λ~w, λc) to give the same H for any λ ∈ R \ {0}, notice that the marginal planes are different. This is because the margin width has scaled by a factor of 1/|λ|.
Parametrizing hyperplanes and their margins

Lemma 6

An ordered pair of distinct parallel hyperplanes (H−, H+) in Rⁿ determines a unique (~w, c) ∈ (Rⁿ \ {~0}) × R whose marginal planes agree with H− and H+.

Proof.

Let ~x+ ∈ H+ and pick û ∈ (H+ − ~x+)⊥ such that if λû ∈ H− and μû ∈ H+, then λ < μ (i.e. û is perpendicular to H+ and points from H− to H+). Also, let ~x− ∈ H−. Since the orthogonal separation between the planes H+ and H− is ⟨~x+ − ~x−, û⟩, set

    ~w := (2/⟨~x+ − ~x−, û⟩) û   and   c := ⟨û, ~x+ + ~x−⟩ / ⟨û, ~x+ − ~x−⟩.

Then (~w, c) has H+ and H− as its marginal planes. ∎

[Figure: the parallel planes H− and H+ with ~x− ∈ H−, ~x+ ∈ H+, the difference ~x+ − ~x−, and the unit normal û.]
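The construction in the proof of Lemma 6 can be sketched in a few lines; the points and normal below are assumed example values, not from the slides:

```python
import numpy as np

# Given parallel planes H- and H+ with unit normal u pointing from H- to
# H+, recover (w, c) as in Lemma 6 (example data: H+ = {x : x[0] == 3},
# H- = {x : x[0] == 1}).
u = np.array([1.0, 0.0])
x_plus = np.array([3.0, 5.0])    # some point on H+
x_minus = np.array([1.0, -2.0])  # some point on H-

sep = np.dot(x_plus - x_minus, u)  # orthogonal separation between the planes
w = (2.0 / sep) * u
c = np.dot(u, x_plus + x_minus) / np.dot(u, x_plus - x_minus)

print(np.dot(w, x_plus) - c)   # 1.0  -> H+ is the marginal plane <w, x> - c = 1
print(np.dot(w, x_minus) - c)  # -1.0 -> H- is the marginal plane <w, x> - c = -1
```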
Parametrizing hyperplanes and their margins

Corollary 7

There is a 1-1 correspondence between the set of (ordered) pairs of parallel hyperplanes (the marginal hyperplanes) and the set (Rⁿ \ {~0}) × R.

Henceforth, we will abuse notation a bit so that when we say "hyperplane," we will always mean an element of (Rⁿ \ {~0}) × R unless otherwise stated.
Separating hyperplanes

Definition 8

Let (X, X+, X−) denote a non-empty set X of vectors in Rⁿ that are separated into the two (disjoint) non-empty sets X+ and X−. Such a collection of sets is called a training data set. A hyperplane H ⊆ Rⁿ, described by (~w, c) ∈ (Rⁿ \ {~0}) × R, separates (X, X+, X−) iff

    ⟨~w, ~x+⟩ − c > 0   and   ⟨~w, ~x−⟩ − c < 0

for all ~x+ ∈ X+ and for all ~x− ∈ X−. In this case, H is said to be a separating hyperplane for (X, X+, X−). H marginally separates (X, X+, X−) iff

    ⟨~w, ~x+⟩ − c ≥ 1   and   ⟨~w, ~x−⟩ − c ≤ −1

for all ~x+ ∈ X+ and for all ~x− ∈ X−.
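Definition 8 translates directly into code. The helper below is a hypothetical illustration (not from the slides) that checks the two conditions against a training set; the data and the (~w, c) used to exercise it come from the worked example at the end of the talk:

```python
import numpy as np

def separates(w, c, X_plus, X_minus, marginally=False):
    """Check whether (w, c) (marginally) separates the data, per Definition 8."""
    if marginally:
        return (all(np.dot(w, x) - c >= 1 for x in X_plus)
                and all(np.dot(w, x) - c <= -1 for x in X_minus))
    return (all(np.dot(w, x) - c > 0 for x in X_plus)
            and all(np.dot(w, x) - c < 0 for x in X_minus))

# Training data from the example at the end of the talk.
X_plus = [np.array([0.0, 1.0])]
X_minus = [np.array([-1.0, -2.0]), np.array([2.0, -1.0])]
w, c = np.array([-0.25, 0.75]), -0.25  # the SVM found there: (1/4)[-1, 3], c = -1/4

print(separates(w, c, X_plus, X_minus))                   # True
print(separates(w, c, X_plus, X_minus, marginally=True))  # True
```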
Support vector machine

Definition 9

Let (X, X+, X−) denote a training data set in Rⁿ. Let S_X ⊆ (Rⁿ \ {~0}) × R denote the set of hyperplanes that marginally separate (X, X+, X−). Let f : S_X → R be the function defined by

    (~w, c) ↦ f(~w, c) := 2/‖~w‖,

i.e. the margin width. A support vector machine (SVM) for (X, X+, X−) is a maximum of f, i.e. an SVM is a marginally separating hyperplane (~w, c) ∈ (Rⁿ \ {~0}) × R such that 1/‖~w′‖ ≤ 1/‖~w‖ for every other marginally separating hyperplane (~w′, c′) ∈ (Rⁿ \ {~0}) × R.
Support vector machine

Example 10

An SVM is a hyperplane that maximizes the margin width.

[Figure: a training data set of + and − points with the maximum-margin hyperplane H and its marginal planes H− and H+.]
The support vectors of a marginally separating hyperplane

How will we know when we have found an SVM? The key to determining marginally separating hyperplanes will be in the possible support vectors. Our goal will then be to maximize the margin over all possible support vectors.

Definition 11

Let (X, X+, X−) be a training data set and let (~w, c) be a marginally separating hyperplane for this set. The elements of X ∩ H± are called support vectors for (~w, c). The set of support vectors is denoted by H^supp_X. The notation H^supp_X± := H^supp_X ∩ H± will also be used to denote the sets of positive and negative support vectors.
Examples of support vectors

Example 12

Two examples of support vectors have been circled for two different marginally separating hyperplanes for the same training data set.

[Figure: the same + and − training data shown twice, with the support vectors circled on the marginal planes of two different marginally separating hyperplanes.]
"Enlarging the margins" Lemma: Statement

Lemma 13

Let (X, X+, X−) be a training data set and let (~w, c) be a marginally separating hyperplane for this set. Then there exists a unique marginally separating hyperplane (~v, d) such that

    v̂ = ŵ   and   2/‖~v‖ = min_{~x+ ∈ X+, ~x− ∈ X−} ⟨~x+ − ~x−, ŵ⟩.

In other words, if the marginal planes do not contain any of the training data set, then the separating hyperplane can be translated and the margin width can be enlarged until the margin touches both the positive and negative training data sets. We use the minimum orthogonal distance between the positive and negative training data sets to define the new margin width.
"Enlarging the margins" Lemma: Picture

[Figure: the original marginal planes H−, H, H+ and the enlarged marginal planes K−, K, K+, which touch both the + and − training data.]
"Enlarging the margins" Lemma: Proof

Proof.

Notice that ((c − 1)/‖~w‖) ŵ ∈ H−, (c/‖~w‖) ŵ ∈ H, and ((c + 1)/‖~w‖) ŵ ∈ H+ provide explicit vectors in each of these three hyperplanes. Set m+ to be the remaining minimum orthogonal distance between H+ and X+, and set m− to be the remaining minimum orthogonal distance between H− and X−, namely

    m± := min_{~x± ∈ X±} θ(~x±) (⟨~x±, ŵ⟩ − (c ± 1)/‖~w‖),

where θ : X → R is the function defined by

    θ(~x) := +1 if ~x ∈ X+,  −1 if ~x ∈ X−.
"Enlarging the margins" Lemma: Proof (continued)

Proof.

Therefore, the planes K± containing the vectors ((c ± 1)/‖~w‖ ± m±) ŵ that are perpendicular to ŵ intersect X± but do not contain points of X in the interior of their margin. By Lemma 6, there exists a (~v, d) ∈ (Rⁿ \ {~0}) × R that describes these marginally separating hyperplanes. In fact, they can be given explicitly by

    ~v := (2 / (2/‖~w‖ + m+ + m−)) ŵ

and

    d := ⟨~v, ((c + 1)/‖~w‖ + m+) ŵ⟩ − 1 = (2c + ‖~w‖(m+ − m−)) / (2 + ‖~w‖(m+ + m−)). ∎
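The closed form for d can be checked against the inner-product expression it is derived from; the values of ~w, c, m+ and m− below are arbitrary assumed numbers for illustration:

```python
import numpy as np

# Check that the two expressions for d in the proof of Lemma 13 agree
# (w, c, m_plus, m_minus are arbitrary assumed values).
w = np.array([1.0, 2.0])
c, m_plus, m_minus = 3.0, 0.5, 0.25
nw = np.linalg.norm(w)
w_hat = w / nw

v = (2.0 / (2.0 / nw + m_plus + m_minus)) * w_hat
d_inner = np.dot(v, ((c + 1) / nw + m_plus) * w_hat) - 1
d_closed = (2 * c + nw * (m_plus - m_minus)) / (2 + nw * (m_plus + m_minus))
print(d_inner, d_closed)  # the two expressions agree
```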
Existence of SVMs

Theorem 14

Let (X, X+, X−) be a training data set for which there exists a separating hyperplane. Then there exists a unique SVM for (X, X+, X−).

Proof.

By Lemma 13, it suffices to maximize the margin function f on the subset S^supp_X ⊆ S_X consisting of marginally separating hyperplanes that have both positive and negative support vectors, namely on

    S^supp_X := { (~w, c) ∈ S_X : H± ∩ X± ≠ ∅ }.

The goal is therefore to maximize the margin function, which is a function of (~w, c), subject to the constraint

    θ(~x)(⟨~w, ~x⟩ − c) − 1 = 0   for all ~x ∈ H^supp_X.
Proof of existence of SVMs

Proof.

Maximizing the margin function is equivalent to minimizing the function

    g(~w, c) := (1/2)‖~w‖² − Σ_{~x ∈ X} α_~x (θ(~x)(⟨~w, ~x⟩ − c) − 1).

Here, α_~x = 0 for all ~x ∈ X \ H^supp_X (remember, H^supp_X is the set of support vectors for the given hyperplane) and α_~x needs to be determined for all ~x ∈ H^supp_X. This condition guarantees that the function g equals 2/f² when restricted to S^supp_X (but notice that it does not equal 2/f² on the larger domain S_X of all marginally separating hyperplanes). The α_~x are called Lagrange multipliers. The extrema of g occur at points (~w, c) for which the derivative of g vanishes with respect to these coordinates:

    ∂g/∂~w |(~w,c) = 0   and   ∂g/∂c |(~w,c) = 0.
Proof of existence of SVMs (continued)

Proof.

The first of these two equations gives

    ~w = Σ_{~x ∈ X} α_~x θ(~x) ~x,   (1)

while the second equation gives

    Σ_{~x ∈ X} α_~x θ(~x) = 0,   (2)

which is a condition that the Lagrange multipliers have to satisfy. Plugging these results back into the function g gives

    g(~w, c) = Σ_{~x ∈ X} α_~x − (1/2) Σ_{~x,~y ∈ X} α_~x α_~y θ(~x) θ(~y) ⟨~x, ~y⟩.
Proof of existence of SVMs (continued)

Proof.

Setting ∂g/∂α_~x |(~w,c) = 0 for each ~x ∈ H^supp_X gives additional conditions that the Lagrange multipliers have to satisfy, namely

    θ(~x)(⟨~w, ~x⟩ − c) = 1

for each ~x ∈ H^supp_X. Plugging the earlier result (1) for ~w into this constraint gives

    Σ_{~y ∈ X} α_~y θ(~y) ⟨~y, ~x⟩ − θ(~x) = c   (3)

for each ~x ∈ H^supp_X. However, there is one subtle point: we do not know what H^supp_X is. Nevertheless, there is still an optimization procedure left over, and it is based on the different possible choices of H^supp_X, i.e. one for each pair of nonempty subsets X′± ⊆ X±.
Proof of existence of SVMs (continued)

Proof.

For each choice of H^supp_X, one has the linear system

    Σ_{~y ∈ X} θ(~y) α_~y = 0
    { Σ_{~y ∈ X} θ(~y) ⟨~y, ~x⟩ α_~y − c = θ(~x) }_{~x ∈ X′+ ∪ X′−}   (4)

in the variables {α_~x : ~x ∈ X′+ ∪ X′−} ∪ {c}, obtained from equations (2) and (3). This describes a linear system of |H^supp_X| + 1 equations in |H^supp_X| + 1 variables. At least one of these systems is consistent because we have assumed that the training data set can be separated. Hence, a solution to the SVM problem exists, since it is given by maximizing 1/‖~w‖ over the (finite) set of pairs of nonempty subsets X′± ⊆ X±. ∎
Example of an SVM

Consider the training data set given by

    X+ := { ~x+ := [0, 1]ᵀ }   and   X− := { ~x¹− := [−1, −2]ᵀ, ~x²− := [2, −1]ᵀ }.

[Figure: the single + point and the two − points in the plane.]

The inner products are given by

    ⟨~x+, ~x+⟩ = 1    ⟨~x+, ~x¹−⟩ = −2    ⟨~x+, ~x²−⟩ = −1
    ⟨~x¹−, ~x¹−⟩ = 5    ⟨~x¹−, ~x²−⟩ = 0    ⟨~x²−, ~x²−⟩ = 5
Example: case i) H^supp_X = {~x+, ~x¹−, ~x²−}

For this case, (4) reads

    α_~x+ − α_~x¹− − α_~x²− = 0
    α_~x+ + 2α_~x¹− + α_~x²− − c = 1
    −2α_~x+ − 5α_~x¹− − c = −1
    −α_~x+ − 5α_~x²− − c = −1

and the solution to this linear system is

    α_~x+ = 5/16,   α_~x¹− = 1/8,   α_~x²− = 3/16,   c = −1/4.

The associated vector ~w and margin width are therefore given by

    ~w = (5/16)[0, 1]ᵀ − (1/8)[−1, −2]ᵀ − (3/16)[2, −1]ᵀ = (1/4)[−1, 3]ᵀ   ⟹   2/‖~w‖ = 8/√10.
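Case i) can be reproduced by solving the 4×4 system (4) directly; a NumPy sketch (added here, not part of the original slides):

```python
import numpy as np

# Case i): solve system (4) for H_supp = {x+, x1-, x2-} in the unknowns
# (a_plus, a1, a2, c), then recover w via equation (1).
A = np.array([
    [ 1.0, -1.0, -1.0,  0.0],  #  a+ -  a1 -  a2      = 0
    [ 1.0,  2.0,  1.0, -1.0],  #  a+ + 2a1 +  a2 - c  = 1
    [-2.0, -5.0,  0.0, -1.0],  # -2a+ - 5a1      - c  = -1
    [-1.0,  0.0, -5.0, -1.0],  #  -a+      - 5a2 - c  = -1
])
b = np.array([0.0, 1.0, -1.0, -1.0])
a_plus, a1, a2, c = np.linalg.solve(A, b)
print(a_plus, a1, a2, c)  # 5/16, 1/8, 3/16, -1/4

x_plus = np.array([0.0, 1.0])
x1 = np.array([-1.0, -2.0])
x2 = np.array([2.0, -1.0])
w = a_plus * x_plus - a1 * x1 - a2 * x2  # equation (1) with theta = +1, -1, -1
print(w, 2 / np.linalg.norm(w))          # (1/4)[-1, 3], margin width 8/sqrt(10)
```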
Example: case i) H^supp_X = {~x+, ~x¹−, ~x²−} (continued)

The hyperplanes are then given by the solutions to the three equations

    H : (1/4)(−x + 3y) + 1/4 = 0
    H+ : (1/4)(−x + 3y) + 1/4 = 1
    H− : (1/4)(−x + 3y) + 1/4 = −1

[Figure: H−, H, H+ for case i), with all three training points lying on the marginal planes.]
Example: case ii) H^supp_X = {~x+, ~x¹−}

For this case, α_~x²− = 0 and (4) reads

    α_~x+ − α_~x¹− = 0
    α_~x+ + 2α_~x¹− − c = 1
    −2α_~x+ − 5α_~x¹− − c = −1

and the solution to this linear system is

    α_~x+ = 1/5,   α_~x¹− = 1/5,   c = −2/5.

The associated vector ~w and margin width are therefore given by

    ~w = (1/5)[0, 1]ᵀ − (1/5)[−1, −2]ᵀ = (1/5)[1, 3]ᵀ   ⟹   2/‖~w‖ = 10/√10.
Example: case ii) H^supp_X = {~x+, ~x¹−} (continued)

The hyperplanes are then given by the solutions to the three equations

    H : (1/5)(x + 3y) + 2/5 = 0
    H+ : (1/5)(x + 3y) + 2/5 = 1
    H− : (1/5)(x + 3y) + 2/5 = −1

[Figure: H−, H, H+ for case ii); the point ~x²− lies inside the margin.]

Even though the margin is larger here, we must reject this solution because it is not a marginally separating hyperplane for the training data set.
Example: case iii) H^supp_X = {~x+, ~x²−}

For this case, α_~x¹− = 0 and (4) reads

    α_~x+ − α_~x²− = 0
    α_~x+ + α_~x²− − c = 1
    −α_~x+ − 5α_~x²− − c = −1

and the solution to this linear system is

    α_~x+ = 1/4,   α_~x²− = 1/4,   c = −1/2.

The associated vector ~w and margin width are therefore given by

    ~w = (1/4)[0, 1]ᵀ − (1/4)[2, −1]ᵀ = (1/2)[−1, 1]ᵀ   ⟹   2/‖~w‖ = 2√2 = 4√5/√10.
Example: case iii) H^supp_X = {~x+, ~x²−} (continued)

The hyperplanes are then given by the solutions to the three equations

    H : (1/2)(−x + y) + 1/2 = 0
    H+ : (1/2)(−x + y) + 1/2 = 1
    H− : (1/2)(−x + y) + 1/2 = −1

[Figure: H−, H, H+ for case iii); the point ~x¹− lies inside the margin.]

We must also reject this solution because it is not a marginally separating hyperplane for the training data set. Hence, our first (~w, c) is the SVM.
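The three cases can also be compared programmatically: keep only the candidates that marginally separate the data, then take the one with the largest margin width. This sketch (added here, not from the slides) reproduces the conclusion:

```python
import numpy as np

X_plus = [np.array([0.0, 1.0])]
X_minus = [np.array([-1.0, -2.0]), np.array([2.0, -1.0])]

# The (w, c) candidates from cases i), ii), iii).
candidates = {
    "i":   (np.array([-0.25, 0.75]), -0.25),  # (1/4)[-1, 3], c = -1/4
    "ii":  (np.array([0.2, 0.6]),    -0.4),   # (1/5)[1, 3],  c = -2/5
    "iii": (np.array([-0.5, 0.5]),   -0.5),   # (1/2)[-1, 1], c = -1/2
}

def marginally_separates(w, c):
    eps = 1e-9  # tolerance for points lying exactly on a marginal plane
    return (all(np.dot(w, x) - c >= 1 - eps for x in X_plus)
            and all(np.dot(w, x) - c <= -1 + eps for x in X_minus))

valid = {name: 2 / np.linalg.norm(w)
         for name, (w, c) in candidates.items() if marginally_separates(w, c)}
print(valid)  # only case "i" survives, so it is the SVM
```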
Thank you

Thanks for your attention!