optoelectronic stochastic parallel processors for real time image processing groupe physique des...
TRANSCRIPT
Optoelectronic Stochastic Parallel Processors for real time image processing
Groupe Physique des images. Ph. Lalanne, J.C.Rodier et P.ChavelInstitut d’Optique Théorique et appliquée, CNRS URA 14
A.Dupret, E.Belhaire and P.GardaInstitut d’Electronique Fondamentale, CNRS URA 22.
Alvaro CASSINELLI
I.Glaser and A.A.Friesem Weizmann Institute, Physics of Complex Systems.
2Plan
I) Low level image processing
Mathematical Model (energy function)
Combinatorial problem and computational load (stochastic optimization).
II) Parallel Processor Implementation
First prototype on CMOS : SPIE600 (24x25)
Noise cleaning demonstration using SPIE600
Demonstrator enhanced with Optical Interconnections
Motion Detection Demonstrator
III) Modeling of an hybrid Silicon/AsGa Parallel Processor
Monolithic convolution : proposed architecture
Performance evaluation
IV) Further research
I) Optimization and Low-level image processing
Bayesian approach to image restoration
Optimisation problem
4
Inverse problem :
Bestimation...
observation...
A?
…there are too many possible solutions
…introducing a priori knowledge enables us to consider only the more probable !
)()/().()/(
BPxBPxPBxP
This is a “probabilistic regularization" technique (Bayesian restoration)
a priori term:- Favor smooth images(MRF field)
Likelihood term : - Favor images which “looks like the observation”
- it is constant on x
(a posteriori probability )
We look for the image A maximizing P(x/B)...
Image restoration : Model
5
Many low level image processing
tasks …
… can be seen as an optimization problem (absolute minimum of an energy function)
bad
good
Conclusion
Optimisation problemAlvaro Cassinelli:
REM: Il existe bien d'autres formes d'estimateurs que le MAP.
Le modèle bayesien est aussi utile pour grand nombre de problèmes en traitement d'images autres que les problèmes de restauration, car il représente une façon formelle pour introduire des informations a priori dans le problème.
Alvaro Cassinelli:
REM: Il existe bien d'autres formes d'estimateurs que le MAP.
Le modèle bayesien est aussi utile pour grand nombre de problèmes en traitement d'images autres que les problèmes de restauration, car il représente une façon formelle pour introduire des informations a priori dans le problème.
)(),()/( xregUBxobs
UBxU +=
From observation model
),(exp1)/( BxU
ZBxP
rewriting the a posteriori probability as :
we define the « energy » of the image :
From regularisation model
6Exemple : binary images
… if image is binary :
.:::
....010
....110
....101
x
),( ),(),(),(
1.2.121.2.12)/(ji jiVlk
klijji
ijij xxBxBxU
Interaction neighborhood
),( ),(),(
2
),(
2)/(ji jiVlk
klijji
ijij xxBxBxU
Energy
Image B Image x
a priori : avoid discontinuities
Constraint to the data
Interpretation
Constraint to the data term Regularisation term
7Practical Resolution : Deterministic updating
image 256x256 Energy function has 65536 variables !Problem :
Exhaustive search ? Impossible ! (2256x256 1020000 images)
First idea : deterministic updating
Random initialisation :
For each pixel (i,j)…
U0=U(x0)=U(1,0,1,1,0,….)
U(1,0,?,1,0,….)
…choose the value that makes the energy decreases :
ijij UF = U(…, xij= 0, ...) - U(…, xij=1, ...)
" Local FORCE "Fij>0
New state 1
New state 0
Fij<0
8Practical resolution : stochastic optimization
The problem :
Deterministic optimization
local minima xO of the energy !
xO
U<0 U>0
x=A
…image is “trapped” in a
- controlled “mistakes” are allowed :
Simulated Annealing
Stochastic Optimization
Solution :
9
Repeat by slowly decreasing the « temperature » :
For each pixel (i,j), compute the local force :
… and choose more likely the state which makes the energy decrease, (but not systematically) :
T
Fx
ijij
exp1
1]1Pr[
Stochastic updating of image :
-4 -2 0 2 40
0,5
1
T=
T=0
Force
Pro
bab
iliy
ijij UF = U(…, xij= 0, ...) - U(…, xij=1, ...)
Practical resolution : Simulated Annealing Algorithm
10Practical resolution : Parallel Updating
For many applications, computation of the force is a local operation :
ji,)lk,(
klijij 1.212V
xBF
Parallel Processor
Interconnection topology
11Conclusion
Low-level Image processing
Optimization problem
Stochastic
Processor
P P P P S S
Energy Function
Parallel
Updating
Local force computation
Optoelectronics ...
OO
II) Optoelectronic implementation
• Motion detection demonstrator
• First prototype : CMOS 24x25 SPIE600 processor
• Demonstration : restoration of binary images
• Enhanced demonstrator using optical interconnects
13SPIE600 parallel processor
CMOS circuit 0,8 m
one bit memory xij
local force computation :
deterministic updating ! (comparator)
optical (2 PD) electrical (currents)
)ji,()lk,(
klijij 1.212V
xBF FC FS
Elementary processor
xij
FS
FC
Noise cleaning demonstrator
(on binary images)
15Noise cleaning demontrator : architecture
Laser source
CCD
SLM SPIE600
Corrupted binary image
IN
Restored image
OUT
pixel Bij
FS (-)
+-
FS (+)FC (+)
FC (-)
(dual rail-encoding)
PEij
xij
)ji,()lk,(
kijij 1.212V
lxBF FC FS
…deterministic updating !
16…stochastic updating
+
=
Speckle generator
Laser diode
multimode fiber
Rotating diffuser
Deterministic comparator (fixed threshold)
xij
FS (+)
FS (-)
FC (+)
FC (-)
“Stochastic” comparator T=0
T2>T1>0
Fij>0Fij<0P
rob
ab
ilité
Laser power T force locale
(seuil aléatoire)
FijT xij
17Optoelectronic Simulated Annealing
- 50 temperature steps
- several thousand updating cycles
- … everything in 40 ms
Simulated Annealing :
Image CCD (annealing)
Laser source
CCD
SLM SPIE600
from Speckle generator
18Noise cleaning demonstrator : view of the setup
19Noise cleaning demonstrator : some experimental results
Slow computer interface «only» video rate (23 images/sec)
Without speckle (deterministic)
With speckle (stochastic)
Enhanced demonstrator with
Optical interconnections
21
Electrical interconnections : cumbersome, short range, fixed...
Shift-invariant neighborhood optical convolution !
Optical interconnections : optical convolutionAlvaro Cassinelli:
à partir de ce moment, les interconnexions électriques ne sont plus utilisées
Alvaro Cassinelli:
à partir de ce moment, les interconnexions électriques ne sont plus utilisées
Dammann gratings (2-level phase)
SLM (B) SPIE600
(4 neighboors)
(8 neighboors)
4-f architecture
+
-
+
-
+
-
+
-
PE (i,j-1)
PE (i+1,j)
PE (i-1,j)
+
-
PE (i,j+1)
Optical neighborhood
22Optical interconnection : test of optical loop
4 nearest neighborhood interconnection topology :
8 nearest neighborhood interconnection topology :
Alvaro Cassinelli:
-aberrations
-il faut détruire la coherence spatiale des sources laser des SLMs
- de toute manière, les irrégularités du voisinage n'affecteront pas outre mesure le résultat du traitement (voisinage)
Alvaro Cassinelli:
-aberrations
-il faut détruire la coherence spatiale des sources laser des SLMs
- de toute manière, les irrégularités du voisinage n'affecteront pas outre mesure le résultat du traitement (voisinage)
23
source laser
CCD
SLM (A)SPIE600SPIE600
From laser diode fiber A
FC
SLM (A)
Optically interconnected demonstrator : optical “force”
)ji,()lk,(
klijij 1.212V
xBF
SLM A
FC
SLM B + grating
FS
diode B power
diode A power =
Speckle generator
From laser diode fiber B grating
FS
SLM (B)
Electronic feed-back loop
(corrupted image)
input B
Aoutput
(restored image)
24
SLM-A
SLM-B
CCDSPIE600
Speckle generator input
Dammann grating
Laser diodes
21 cm
35 cm
14 cm
Optically interconnected demonstrator
Motion detection demonstrator
(with optical interconnections)
26Motion detection : principles
Goal :Input data : gray level image sequence
Output : image segmented in two motion regions (Binary Mask)
Hypothesis : Still camera
Quasi-constant scene illumination...
Relationship between luminance and motion
I(t)
MB(t)
27
- Previous field BM(t-1) is known…
- Observed field (input) CD(t) ...MB(t) ?
?
Changes Detection Binary field
I(t)
I(t-1)
BM(t)
-
BM(t-1)
?
Motion detection : algorithm
…without noise :
CD =1 : flip motion label
CD =0 : save motion label
MB(t)= CS(t) XOR MB(t-1)
XOR
…with noise :
!?
B(t)= CD(t) XOR BM(t-1) BM(t)
Restoration of the binary «image» :
XOR
Rest.
28
Detected motion(Binary Mask)
Motion detection : experimental results
Performances :
Input sequence Change Detection
24 sec/image (optical 8 nearest neigh.)
5 sec/image (optical 4 nearest neigh.)
0,04 sec/image (electronic 4 nearest neigh.)
III) Modeling of an hybrid Si/AsGa Processor
• Hybrid technology and monolithic convolution
• Performance estimation
30Hybrid prototype : “monolithic” convolution
…slow optical loop because of electronics !!
«Hybrid» technology
CMOS circuit : computation (threshold and memory)
MQW AsGa array : input/output
bonding pads
Modulation/detection
GaAs
Alvaro Cassinelli:
-la puce en AsGa : détecte et module
-la puce Silicium : calcule
Alvaro Cassinelli:
-la puce en AsGa : détecte et module
-la puce Silicium : calcule
“monolithic” feed-back ?
Source/detector pixel array !!
31Hybride CMOS/MQW prototype : evaluated architecture
(the image to be processed is entered electronically)
Global architecture
Hybrid circuit
grating
Read laser beam
Array illuminator
Elementary processor
modulators detectors
+-
xij
xij FS(+)
FS(-)
xij
Bij
- pair of modulators
- pair of detectors
- 2 bits : xij et Bij
32Prototype en technologie hybride : bilan optoélectronique
System performances are mainly limited by the optical power
0 10 20 30 40 50 60 70 80 90100
101
102
103
104
Heat dissipation
Q =5W
cadence
0 10 20 30 40 50 60 70 80 90100
101
102
103
104
Electrical power
P =5W
cadence
83,3 MHz
0 10 20 30 40 50 60 70 80 90100
101
102
103
104
105
106
Updating frequency (MHz)
Ptot= 1W
f4=73.3 MHz
f8=66.8 MHz
f12 =64.1 MHz
Hybrid array 16x32 PEs
Optical Power (mW)
33Hybrid CMOS/MQW prototype : final performances
In the worst case, 500 times video rate ! (independently on the array size)
Système de voisinage Nombre de recuits par seconde
SPIE600 (voisinage electronique) 23
SPIE600 (voisinage optique) < 0,02
Circuit hybride (voisinage 4) 36700
Circuit hybride (voisinage 8) 16700
Circuit hybride (voisinage 12) 12800
IV) Conclusion
Results
Possible research directions
35Results
Performances :
Optically interconnected system
Dedicated Machines for image processing
Optics in computing :
Optoelectronic demonstrator :
- noise cleaning, motion detection- Real time simulated annealing
- Silicon chip : 5 seconds/image- Hybrid circuit : tens of thousand images/sec
Optical random number generation
Compact
36
Inte
rco
nn
ecit
on
op
tica
l ban
dw
ith
Compared performances
10 101 10 2 103 104 10 5 106 10 7 1080
Number of optical interconnections in the system
1010
10
10
10
10
10
10
2
4
6
8
10
12
SYSTEM performances (bits/sec)
1012
1014
108
HERE/98
BS/98
CITR/95
AT&T/93
SPARCL/95
AT&T/91
PPOS4
E4O4
O8 O12
PPOS12
OHM4 2002OHM4 2007
FET-SEED
CMOS-SEED
S-SEED
CLF/CI + PnpN
CLF/CI + Si
VCSEL/MSM
Multi-chip
demonstrators
PPOS8
Tera Bit /sec
37Possible research directions...
Rapid and local reconfigurable interconnects :
Optical random number generator :
OSPP with reconfigurable optical interconnects :
…a general purpose optoelectronic stochastic parallel processor for low-level image processing ?
…stochastic neural networks (Boltzmann Machines)
...cryptograpy, telecom channel simulation, Monte Carlo methods...
Alvaro Cassinelli:
(speckle : Gb/s contre Mb/s pour puce VLSI, 1998)
Alvaro Cassinelli:
(speckle : Gb/s contre Mb/s pour puce VLSI, 1998) Micro-optic integration.