
Page 1:

Course #4 - A (very) short introduction to proximal algorithms

J. Bobin - [email protected] - MVA - Analyse de données parcimonieuses en astrophysique

Page 2:

Solving inverse problems

More generally, we will focus on linear inverse problems of the form:

$$b = Ax + n$$

where $b$ is the data (observations, etc.), $A$ is the observation operator, $x$ is the signal to be retrieved, and $n$ accounts for noise, model imperfections, etc.

This models many inverse problems arising in physics:

- Denoising: $A$ is the identity operator (this course)
- Deconvolution: $A$ is a convolution kernel (this course)
- Inpainting/missing-data interpolation: $A$ is a binary mask (Course #4)
- Tomographic reconstruction: $A$ is a partial Radon transform (Course #5)
- Radio-interferometric reconstruction: $A$ is a partial Fourier transform (Course #5)
- Compressed sensing (Course #5)
- Blind source separation (Courses #6-8)

Page 3:

Solving inverse problems

Let's assume $x$ is sparse in some orthogonal basis $\Phi$, i.e. $x = \Phi\alpha$ with $\alpha = \Phi^T x$ sparse. The estimation problem then reads:

$$\hat{x} = \operatorname*{Argmin}_{x = \Phi\alpha}\; P(\alpha) + \|b - \Phi\alpha\|_{\ell_2}^2$$

where $\|b - \Phi\alpha\|_{\ell_2}^2$ is the data-fidelity term (it measures how well the model fits the data) and $P(\alpha)$ is a sparsity-enforcing penalty.

Examples of penalty terms:

$$P(\alpha) = \|\alpha\|_{\ell_1} = \sum_i |\alpha[i]|$$

$$P(\alpha) = \|\alpha\|_{\ell_0}, \quad \text{where the } \ell_0\text{-norm counts the number of nonzero elements}$$

Page 4:

Solving inverse problems

Computing the solution to an inverse problem also boils down to solving a minimization of the form:

$$\hat{x} = \operatorname*{Argmin}_x\; g(x)$$

Example: least-square estimator, maximum likelihood estimator, etc., such as the Poisson likelihood

$$\hat{x} = \operatorname*{Argmin}_x\; \sum_i x_i - b_i \log(x_i)$$

Or, more generally,

$$\hat{x} = \operatorname*{Argmin}_x\; f(x) + g(x)$$

Example: penalized least-square estimator, etc.

$$\hat{x} = \operatorname*{Argmin}_x\; \lambda \|\Phi^T x\|_{\ell_p} + \frac{1}{2}\|b - Hx\|_{\ell_2}^2$$

Page 5:

Let’s warm up with a simple case

Let’s consider the following simple case:

$$\hat{x} = \operatorname*{Argmin}_x\; g(x)$$

where $g$ verifies the following properties:

- It is convex: $\forall x, y \in \mathrm{Dom}\,g,\ \alpha \in [0,1]:\ g(\alpha x + (1-\alpha)y) \le \alpha g(x) + (1-\alpha) g(y)$
- It is differentiable: $\nabla g$ is defined on $\mathrm{Dom}\,g$
- Its gradient is Lipschitz: $\forall x, y \in \mathrm{Dom}\,g:\ \|\nabla g(x) - \nabla g(y)\| \le L \|x - y\|$

Example: $g(x) = \|b - Hx\|_{\ell_2}^2$, with $\nabla g(x) = 2H^\star(Hx - b)$ and $L = 2\|H^\star H\|_2$.

Page 6:

Gradient descent

In that case, having access to first-order information about $g$, the simplest first-order algorithm is gradient descent:

$$x^{(t+1)} = x^{(t)} - \gamma \nabla g(x^{(t)})$$

[Figure: level sets $g(x) = c$, with the iterates $x^{(0)}, x^{(1)}, x^{(2)}$ moving toward the minimum]
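As an illustration, here is a minimal NumPy sketch of this iteration for the quadratic example of the previous page; the problem sizes, step-size choice, and iteration count are illustrative assumptions, not from the slides:

```python
import numpy as np

# Minimal sketch of gradient descent for g(x) = ||b - Hx||_2^2.

def gradient_descent(H, b, n_iter=500):
    L = 2.0 * np.linalg.norm(H.T @ H, 2)   # Lipschitz constant of grad g
    gamma = 1.0 / L                        # fixed step size
    x = np.zeros(H.shape[1])
    for _ in range(n_iter):
        x = x - gamma * 2.0 * H.T @ (H @ x - b)   # grad g(x) = 2 H*(Hx - b)
    return x

# Usage on a synthetic, well-posed system:
rng = np.random.default_rng(0)
H = rng.standard_normal((20, 10))
x_true = rng.standard_normal(10)
x_hat = gradient_descent(H, H @ x_true)
print(np.linalg.norm(x_hat - x_true))   # should be close to 0
```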

Page 7:

A more complex problem

Let's reconsider the following $\ell_1$-penalized least-square problem:

$$\hat{x} = \operatorname*{Argmin}_x\; \lambda \|\Phi^T x\|_{\ell_1} + \frac{1}{2}\|b - Hx\|_{\ell_2}^2$$

that is, a problem of the form

$$\hat{x} = \operatorname*{Argmin}_x\; f(x) + g(x)$$

where $f(x) = \lambda \|\Phi^T x\|_{\ell_1}$ is convex but not differentiable, and $g(x) = \frac{1}{2}\|b - Hx\|_{\ell_2}^2$ is convex and differentiable with an $L$-Lipschitz gradient.

Page 8:

Subgradient

A more precise description of $f(x)$ requires defining the subgradient of a convex function:

$$\partial f(x) = \{u \in \mathrm{Dom}\,f\; ;\; \forall y \in \mathrm{Dom}\,f,\ f(x) + \langle y - x, u\rangle \le f(y)\}$$

For example, let's go back to the $\ell_1$-norm, $f(x) = \|x\|_{\ell_1}$ (in one dimension, $f(x) = |x|$):

- for $x > 0$: $\partial f(x) = \{1\}$
- for $x < 0$: $\partial f(x) = \{-1\}$
- for $x = 0$: $\partial f(x) = [-1, 1]$

Page 9:

Proximal operator

This now allows us to define the key element of proximal calculus, the proximal operator of a function $f$:

$$\mathrm{prox}_f(x) = \operatorname*{Argmin}_v\; f(v) + \frac{1}{2}\|x - v\|_{\ell_2}^2$$

[Figure: level sets $f(x) = c$ over $\mathrm{Dom}\,f$]

Page 10:

Proximal operator, properties

Some useful properties (among others):

i) translation: $h(x) = f(x - z) \;\Rightarrow\; \mathrm{prox}_h(x) = z + \mathrm{prox}_f(x - z)$

ii) scaling: $h(x) = f(x/\lambda) \;\Rightarrow\; \mathrm{prox}_h(x) = \lambda\, \mathrm{prox}_{f/\lambda^2}(x/\lambda)$

iii) reflection: $h(x) = f(-x) \;\Rightarrow\; \mathrm{prox}_h(x) = -\mathrm{prox}_f(-x)$

iv) conjugation: $h(x) = f^\star(x) = \max_z\, \langle z, x\rangle - f(z) \;\Rightarrow\; \mathrm{prox}_h(x) = x - \mathrm{prox}_f(x)$

v) change of basis: $h(x) = f(\Phi^T x) \;\Rightarrow\; \mathrm{prox}_h(x) = \Phi\, \mathrm{prox}_f(\Phi^T x)$ (for an orthonormal $\Phi$)

Page 11:

Proximal operator, examples

The indicator function of a convex set $K$:

$$\iota_K(x) = \begin{cases} 0 & \text{if } x \in K, \\ +\infty & \text{otherwise} \end{cases}
\qquad \mathrm{prox}_{\iota_K}(x) = P_K(x)$$

It is the orthogonal projector onto $K$!

Example of the non-negative orthant $K = \{u;\ u \ge 0\}$:

$$\mathrm{prox}_{\iota_K}(x) = \begin{cases} x & \text{if } x \ge 0, \\ 0 & \text{otherwise} \end{cases}$$

The squared $\ell_2$ norm:

$$f(x) = \lambda \|x\|_{\ell_2}^2 \;\Rightarrow\; \mathrm{prox}_f(x) = \frac{1}{1 + 2\lambda}\, x$$
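A minimal NumPy sketch of these two proximal operators (function names are mine):

```python
import numpy as np

def prox_indicator_nonneg(x):
    """prox of the indicator of K = {u : u >= 0}, i.e. the projection onto K."""
    return np.maximum(x, 0.0)

def prox_sq_l2(x, lam):
    """prox of f(x) = lam * ||x||_2^2, a simple shrinkage toward zero."""
    return x / (1.0 + 2.0 * lam)

x = np.array([-1.0, 0.5, 2.0])
print(prox_indicator_nonneg(x))   # [0.  0.5 2. ]
print(prox_sq_l2(x, 0.5))         # [-0.5  0.25  1.  ]
```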

Page 12:

Proximal operator, examples

The $\ell_1$ norm: $f(x) = \lambda \|x\|_{\ell_1}$. By definition of the proximal operator:

$$\mathrm{prox}_f(x) = \operatorname*{Argmin}_u\; \lambda\|u\|_{\ell_1} + \frac{1}{2}\|x - u\|_{\ell_2}^2 = S_\lambda(x)$$

where $S_\lambda$ is the soft-thresholding operator, $S_\lambda(x)[i] = \mathrm{sign}(x[i])\,\max(|x[i]| - \lambda,\, 0)$.

The Poisson log-likelihood:

$$f(x) = -k \log(x) + x \;\Rightarrow\; \mathrm{prox}_f(x) = \frac{1}{2}\left(x - 1 + \sqrt{|x - 1|^2 + 4k}\right)$$
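Both operators have the closed forms above; a minimal NumPy sketch (function names are mine):

```python
import numpy as np

def soft_threshold(x, lam):
    """prox of f(x) = lam * ||x||_1, applied entry-wise."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def prox_poisson(x, k):
    """prox of f(x) = -k log(x) + x, entry-wise (output is always positive)."""
    return 0.5 * (x - 1.0 + np.sqrt((x - 1.0) ** 2 + 4.0 * k))

print(soft_threshold(np.array([-2.0, 0.3, 1.5]), 0.5))   # [-1.5  0.   1. ]
```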

Page 13:

Forward-backward splitting algorithm

Let's go back to our minimization problem:

$$\hat{x} = \operatorname*{Argmin}_x\; f(x) + g(x)$$

with $f$ convex but not differentiable, and $g$ convex and differentiable with an $L$-Lipschitz gradient. It has been shown that the following iterative scheme solves this problem:

$$x^{(t+1)} = \mathrm{prox}_{\gamma f}\left(x^{(t)} - \gamma \nabla g(x^{(t)})\right), \qquad \gamma \in \left]0,\, 1/L\right[$$

i.e., a gradient (forward) step on $g$ followed by a proximal (backward) step on $f$.
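A minimal generic sketch of this scheme; the gradient of $g$, the proximal operator of $f$, and the Lipschitz constant are supplied by the caller, and the interface (in particular `prox_f(v, gamma)` implementing $\mathrm{prox}_{\gamma f}(v)$) is my assumption:

```python
import numpy as np

def fbs(grad_g, prox_f, L, x0, n_iter=300):
    """Forward-backward splitting for f(x) + g(x)."""
    gamma = 0.9 / L                     # step size in ]0, 1/L[
    x = x0.copy()
    for _ in range(n_iter):
        x = prox_f(x - gamma * grad_g(x), gamma)   # forward then backward step
    return x

# Usage on the L1-penalized least squares of page 7 (with Phi = Id):
rng = np.random.default_rng(1)
H = rng.standard_normal((40, 100))
x_true = np.zeros(100); x_true[3] = 1.0
b = H @ x_true
lam = 0.1
grad_g = lambda x: H.T @ (H @ x - b)          # g(x) = 0.5 * ||b - Hx||^2
prox_f = lambda v, g: np.sign(v) * np.maximum(np.abs(v) - g * lam, 0.0)
x_hat = fbs(grad_g, prox_f, np.linalg.norm(H.T @ H, 2), np.zeros(100))
```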

Page 14:

Forward-backward splitting algorithm

Example: constrained minimization over a convex set $K$,

$$\min_{x \in K}\; g(x)$$

[Figure: level sets $g(x) = c$ and the convex set $K$; the iterates $x^{(1)}, x^{(2)}, x^{(3)}$ are projected back onto $K$ after each gradient step]

Page 15:

FBS: denoising with redundant representations

We want to solve a denoising problem by imposing sparsity in a redundant, non-orthogonal transform (undecimated wavelets, curvelets, ridgelets, etc.):

$$\hat{x} = \operatorname*{Argmin}_{x = \Phi\alpha}\; \lambda\|\alpha\|_{\ell_1} + \frac{1}{2}\|b - \Phi\alpha\|_{\ell_2}^2$$

The first term is convex but not differentiable; the second is convex and differentiable with a 1-Lipschitz gradient (assuming the frame is normalized so that $\|\Phi\|_2 = 1$). The forward-backward algorithm then reads:

$$\alpha^{(t+1)} = S_{\gamma\lambda}\left(\alpha^{(t)} + \gamma \Phi^T \left(b - \Phi\alpha^{(t)}\right)\right)$$
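A minimal sketch of this iteration; `Phi` (synthesis) and `Phi_T` (analysis) are hypothetical function handles for the redundant transform, and we assume $\|\Phi\|_2 \le 1$ so that a step $\gamma < 1$ is valid:

```python
import numpy as np

def fbs_denoise(Phi, Phi_T, b, lam, gamma=0.9, n_iter=100):
    alpha = Phi_T(b)                          # start from analysis coefficients
    for _ in range(n_iter):
        alpha = alpha + gamma * Phi_T(b - Phi(alpha))         # gradient step
        alpha = np.sign(alpha) * np.maximum(np.abs(alpha) - gamma * lam, 0.0)
    return Phi(alpha)                         # the denoised signal
```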

Page 16:

FBS: denoising with redundant representations

Page 17:

FBS: deconvolution

We want to solve a deconvolution problem by imposing sparsity in some transform $\Phi$:

$$\hat{x} = \operatorname*{Argmin}_{x = \Phi\alpha}\; \lambda\|\alpha\|_{\ell_1} + \frac{1}{2}\|b - H\Phi\alpha\|_{\ell_2}^2$$

where the first term is convex but not differentiable, and the second is convex and differentiable with an $L$-Lipschitz gradient. The forward-backward algorithm then reads:

$$\alpha^{(t+1)} = S_{\gamma\lambda}\left(\alpha^{(t)} + \gamma \Phi^T H^T \left(b - H\Phi\alpha^{(t)}\right)\right), \qquad \gamma < \frac{1}{\|H\|_2^2}$$
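A minimal sketch for 1-D periodic deconvolution, where $H$ is applied in the Fourier domain so that $\|H\|_2$ is simply the largest magnitude of the kernel's DFT; the periodic model and the choice $\Phi = \mathrm{Id}$ are simplifying assumptions of mine:

```python
import numpy as np

def fbs_deconvolve(h, b, lam, n_iter=200):
    h_hat = np.fft.fft(h, n=len(b))
    H   = lambda x: np.real(np.fft.ifft(h_hat * np.fft.fft(x)))
    H_T = lambda x: np.real(np.fft.ifft(np.conj(h_hat) * np.fft.fft(x)))
    gamma = 0.9 / np.max(np.abs(h_hat)) ** 2     # gamma < 1 / ||H||_2^2
    alpha = np.zeros_like(b)
    for _ in range(n_iter):
        alpha = alpha + gamma * H_T(b - H(alpha))             # gradient step
        alpha = np.sign(alpha) * np.maximum(np.abs(alpha) - gamma * lam, 0.0)
    return alpha
```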

Page 18:

FBS: deconvolution

[Figure, reproduced from Starck, Pantin & Murtagh, PASP 2002, 114:1051-1069, Fig. 9: (a) β Pictoris raw data; (b) filtered image; (c) deconvolved image.]

Page 19:

FBS: Inpainting

Inpainting problems arise when one wants to recover an image from incomplete measurements:

$$b = M \odot x + n$$

where $M$ is a binary mask and $\odot$ denotes entry-wise multiplication (the Hadamard product).

[Figure: example image with 90% of the pixels missing]

Page 20:

FBS: Inpainting

Inpainting has been tackled by solving an $\ell_1$-penalized least-square problem of the form:

$$\hat{x} = \operatorname*{Argmin}_{x = \Phi\alpha}\; \lambda\|\alpha\|_{\ell_1} + \frac{1}{2}\|b - M\Phi\alpha\|_{\ell_2}^2$$

where the mask $M$ is recast as a diagonal matrix. The first term is convex but not differentiable; the second is convex and differentiable with a 1-Lipschitz gradient. The forward-backward algorithm then reads:

$$\alpha^{(t+1)} = \mathrm{prox}_{\gamma f}\left(\alpha^{(t)} + \gamma \Phi^T \left(b - M\Phi\alpha^{(t)}\right)\right) = S_{\gamma\lambda}\left(\alpha^{(t)} + \gamma \Phi^T \left(b - M\Phi\alpha^{(t)}\right)\right)$$
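A minimal sketch, reusing the hypothetical `Phi` / `Phi_T` handles of the denoising example; the binary mask enters as an entry-wise product:

```python
import numpy as np

def fbs_inpaint(Phi, Phi_T, M, b, lam, gamma=0.9, n_iter=200):
    alpha = Phi_T(b)
    for _ in range(n_iter):
        alpha = alpha + gamma * Phi_T(b - M * Phi(alpha))     # gradient step
        alpha = np.sign(alpha) * np.maximum(np.abs(alpha) - gamma * lam, 0.0)
    return Phi(alpha)                     # the inpainted image
```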

Page 21:

FBS: Inpainting

𝚽 = [Curvelets, Local DCT]

Page 22:

FBS: convergence

The forward-backward algorithm converges to the minimum of $f + g$ at the following rate:

$$f(x^{(t)}) - f(x^\star) \le \frac{L\|x^{(0)} - x^\star\|^2}{t}$$

which is called a "sublinear" rate of convergence. The approximate number of iterations needed to reach a precision $\epsilon$ is:

$$t_\epsilon = \left\lceil \frac{L\|x^{(0)} - x^\star\|^2}{\epsilon} \right\rceil$$

For instance, with $L = 1$ and $\|x^{(0)} - x^\star\| = 1$, reaching $\epsilon = 10^{-3}$ takes on the order of $10^3$ iterations.

Remark: it is important to notice that a more precise convergence study would reveal that the speed of convergence also depends on the spectrum of the operator $H$.

Page 23:

FBS: refinement with multi-step techniques

The FBS can be sped up by using multi-step techniques, which further exploit information from the previous estimates. The plain FBS update

$$x^{(t+1)} = \mathrm{prox}_{\gamma f}\left(x^{(t)} - \gamma \nabla g(x^{(t)})\right)$$

only depends on $x^{(t)}$. Following the seminal work of Nesterov, first in the early 80s and revisited around 2007, the accelerated FBS is defined as follows:

(0) $\nu_1 = 1,\quad x^{(1)} = x_0,\quad y^{(1)} = x_0$

(1) $x^{(t)} = \mathrm{prox}_{\gamma f}\left(y^{(t)} - \gamma \nabla g(y^{(t)})\right)$

(2) $\nu_{t+1} = \dfrac{1 + \sqrt{1 + 4\nu_t^2}}{2}$

(3) $y^{(t+1)} = x^{(t)} + \dfrac{\nu_t - 1}{\nu_{t+1}}\left(x^{(t)} - x^{(t-1)}\right)$ (averaging of previous iterates)
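A minimal sketch of steps (0)-(3), with the same generic `grad_g` / `prox_f` interface as the FBS sketch above (naming is mine):

```python
import numpy as np

def fista(grad_g, prox_f, L, x0, n_iter=300):
    gamma = 1.0 / L
    x_prev = x0.copy()
    y = x0.copy()
    nu = 1.0
    for _ in range(n_iter):
        x = prox_f(y - gamma * grad_g(y), gamma)              # step (1)
        nu_next = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * nu ** 2))  # step (2)
        y = x + ((nu - 1.0) / nu_next) * (x - x_prev)         # step (3)
        x_prev, nu = x, nu_next
    return x_prev
```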

Page 24:

FBS: refinement with multi-step techniques

In the case of the $\ell_1$-penalized least-square problem, the accelerated scheme achieves (courtesy of Beck/Teboulle, 2009):

$$f(x^{(t)}) - f(x^\star) \le \frac{2L\|x^{(0)} - x^\star\|^2}{(t+1)^2}$$

$$t_\epsilon = \left\lceil \sqrt{\frac{2L\|x^{(0)} - x^\star\|^2}{\epsilon}} - 1 \right\rceil$$

With the numbers of the previous example, reaching $\epsilon = 10^{-3}$ now takes on the order of $\sqrt{2 \times 10^3} \approx 45$ iterations instead of $10^3$.

Page 25:

Primal-dual algorithms

Things get slightly more complicated when we want to minimize a problem of the form:

$$\hat{x} = \operatorname*{Argmin}_x\; f(x) + g(x)$$

where both $f$ and $g$ are convex but not differentiable.

Example 1: sparsity and quadratic constraint

$$\hat{x} = \operatorname*{Argmin}_{x = \Phi\alpha}\; \|\alpha\|_{\ell_1} \quad \text{s.t.} \quad \|b - \Phi\alpha\|_{\ell_2} \le \epsilon$$

Example 2: sparsity and impulsive noise removal

$$\hat{x} = \operatorname*{Argmin}_{x = \Phi\alpha}\; \lambda\|\alpha\|_{\ell_1} + \|b - \Phi\alpha\|_{\ell_1}$$

Page 26:

Primal-dual algorithms

We will more specifically focus on minimization problems of the form:

$$\min_x\; f(x) + g(Ax)$$

which includes most linear inverse problems. We further assume that both $f$ and $g$ are "proximable" (their proximal operators can be computed).

The main idea consists in splitting the application of the proximal operators of each of the functions. For that purpose, one resorts to the Fenchel dual, or convex conjugate:

$$g(x) = \max_y\; \langle y, x\rangle - g^\star(y)$$

The previous problem can then be recast as:

$$\min_x \max_y\; \langle y, Ax\rangle - g^\star(y) + f(x)$$

which turns out to be a saddle-point problem.

Page 27:

Primal-dual algorithms

Convergence to a saddle point of this problem,

$$\min_x \max_y\; \langle y, Ax\rangle - g^\star(y) + f(x),$$

can be obtained by using the following iterative procedure:

(1) $y^{(t+1)} = \operatorname*{Argmax}_y\; \langle y, A\bar{x}^{(t)}\rangle - g^\star(y) - \dfrac{1}{2\tau}\|y - y^{(t)}\|_{\ell_2}^2$

(2) $x^{(t+1)} = \operatorname*{Argmin}_x\; \langle y^{(t+1)}, Ax\rangle + f(x) + \dfrac{1}{2\gamma}\|x - x^{(t)}\|_{\ell_2}^2$

(3) $\bar{x}^{(t+1)} = x^{(t+1)} + \theta\,(x^{(t+1)} - x^{(t)})$

with $\theta \in [0, 1]$ and $\tau\gamma\|A\|^2 < 1$.

Page 28:

Primal-dual algorithms

which eventually reads:

(1) $y^{(t+1)} = \mathrm{prox}_{\tau g^\star}\left(y^{(t)} + \tau A\bar{x}^{(t)}\right)$

(2) $x^{(t+1)} = \mathrm{prox}_{\gamma f}\left(x^{(t)} - \gamma A^T y^{(t+1)}\right)$

(3) $\bar{x}^{(t+1)} = x^{(t+1)} + \theta\,(x^{(t+1)} - x^{(t)})$

with $\theta \in [0, 1]$ and $\tau\gamma\|A\|^2 < 1$. It alternates between $\mathrm{prox}_{g^\star}$ and $\mathrm{prox}_f$.
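A minimal sketch of steps (1)-(3) with $A$ given as a matrix; `prox_f` and `prox_gstar` follow the `prox_f(v, step)` convention used earlier, and `prox_gstar` can be obtained from the prox of $g$ via Moreau's identity, $\mathrm{prox}_{\tau g^\star}(z) = z - \tau\,\mathrm{prox}_{g/\tau}(z/\tau)$. The interface and default step sizes are my assumptions:

```python
import numpy as np

def primal_dual(A, prox_f, prox_gstar, x0, n_iter=300, theta=1.0):
    tau = gamma = 0.9 / np.linalg.norm(A, 2)   # so tau * gamma * ||A||^2 < 1
    x = x0.copy()
    x_bar = x0.copy()
    y = np.zeros(A.shape[0])
    for _ in range(n_iter):
        y = prox_gstar(y + tau * (A @ x_bar), tau)        # dual step (1)
        x_new = prox_f(x - gamma * (A.T @ y), gamma)      # primal step (2)
        x_bar = x_new + theta * (x_new - x)               # extrapolation (3)
        x = x_new
    return x
```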

Page 29:

Example: point source removal

Page 30:

Example

In this context, each channel is made of three components:

$$x = x_1 + x_2 + x_3 + n$$

where $x_1$ is the background, $x_2$ contains the extended sources, and $x_3$ the point sources. This has been tackled by solving, using a primal-dual proximal algorithm:

$$\min_{x_1 \in K,\; x_2,\; x_3 \in B}\; \lambda\|\Phi^T x_2\|_{\ell_1} + \lambda\|F^T x_3\|_{\ell_1} \quad \text{s.t.} \quad \|b - Hx_1 - Mx_2 - x_3\|_{\ell_2} \le \epsilon$$

where $\Phi$ is a wavelet basis, $F$ a harmonic basis, $H$ the PSF, $M$ a point-source mask, $K$ the non-negative orthant, and $B$ the set of band-limited signals.

Page 31:

Example

Here is a challenging example:

[Figure: the observations $b$ and the estimated background component $x_1$]

Page 32:

Page 33:

Going a bit further

Deriving the FBS: one wants to solve

$$\hat{x} = \operatorname*{Argmin}_x\; f(x) + g(x)$$

where $f$ is convex but not differentiable, and $g$ is convex and differentiable with an $L$-Lipschitz gradient. The main idea consists in building an approximation functional that gives an upper bound on $f + g$:

$$A(x, z) = f(x) + g(z) + \langle x - z, \nabla g(z)\rangle + \frac{L}{2}\|x - z\|_{\ell_2}^2$$

After some basic calculation, this yields (up to an additive term that does not depend on $x$):

$$A(x, z) = f(x) + g(z) + \frac{L}{2}\left\|x - \left(z - \frac{1}{L}\nabla g(z)\right)\right\|_{\ell_2}^2$$
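The "basic calculation" is a completion of the square in $x$; explicitly:

$$\langle x - z, \nabla g(z)\rangle + \frac{L}{2}\|x - z\|_{\ell_2}^2 = \frac{L}{2}\left\|x - \left(z - \frac{1}{L}\nabla g(z)\right)\right\|_{\ell_2}^2 - \frac{1}{2L}\|\nabla g(z)\|_{\ell_2}^2$$

and the last term is independent of $x$, so it does not affect the minimizer.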

Page 34:

Going a bit further (2)

This approximation functional admits a unique minimizer over $x$:

$$m_A(z) = \operatorname*{Argmin}_x\; f(x) + \frac{L}{2}\left\|x - \left(z - \frac{1}{L}\nabla g(z)\right)\right\|_{\ell_2}^2$$

which is nothing more than the proximal operator of $f$ applied as follows:

$$m_A(z) = \mathrm{prox}_{\frac{1}{L} f}\left(z - \frac{1}{L}\nabla g(z)\right)$$

The FBS then reduces to:

$$x^{(t+1)} = m_A(x^{(t)})$$