DSP Design
Case Study
Image Processing
Viktor Öwall, Dept. of Electrical and Information Technology, Lund University, Sweden-www.eit.lth.se
Image processing from a hardware perspective
• Often massively parallel
– Can be used to increase throughput
• Memory intensive
– Storage size
– Memory bandwidth
2-dimensional Image Convolution
Image: N x N pixels. Kernel: M x M.
For each pixel position in the image, each kernel value is multiplied with the underlying pixel value, and the products are added to produce the output value:
$y(k_1, k_2) = \sum_{m_1} \sum_{m_2} x(k_1 - m_1, k_2 - m_2)\, h(m_1, m_2)$
A frame is added to avoid border effects.
Image processing is memory intensive!
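The convolution sum can be sketched in a few lines of Python. This is only an illustration: the name `conv2d`, the nested-list image format, the centred kernel and the frame of zeros are assumptions, not details from the slides.

```python
def conv2d(x, h):
    """Direct 2-D convolution y(k1,k2) = sum_m1 sum_m2 x(k1-m1, k2-m2) h(m1,m2).

    x is an N x N image and h an M x M kernel (M odd), both nested lists.
    The kernel is centred on the output pixel; a frame of zeros is assumed
    around the image (an assumption, the slides only say a frame is added).
    """
    n, m = len(x), len(h)
    r = m // 2
    y = [[0] * n for _ in range(n)]
    for k1 in range(n):
        for k2 in range(n):
            acc = 0
            for m1 in range(m):
                for m2 in range(m):
                    i, j = k1 - (m1 - r), k2 - (m2 - r)
                    if 0 <= i < n and 0 <= j < n:   # outside the image: zero frame
                        acc += x[i][j] * h[m1][m2]
            y[k1][k2] = acc
    return y

image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
print(conv2d(image, identity))   # the identity kernel returns the image unchanged
```

The four nested loops make the cost visible: M x M multiplications for every one of the N x N output pixels.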
Edge detection and zero crossings with different kernel sizes
Grain recognition
Edge detection
Increasing filter size means more calculations
Filter size 15 x 15: 225 multiplications per pixel
What datapath architecture?
• Hardware mapped, i.e. 225 multipliers + adders
• Single MAC (Multiply Accumulate) unit
• Hardware for one column each clock cycle
Adder tree structure of processor core
15 pixels read on each clock cycle
Pipelined adder tree
Accumulator
Increased wordlength to keep precision and avoid overflow
Guard bits in accumulator and truncated output
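A rough Python model of the one-column-per-cycle datapath shows where the extra wordlength goes. The 8-bit unsigned pixel and coefficient widths are assumptions made for the sketch, and `adder_tree` and `convolve_pixel` are illustrative names.

```python
import math

M = 15            # kernel is M x M, one column of M pixels per clock cycle
PIXEL_BITS = 8    # assumption: unsigned 8-bit pixels
COEFF_BITS = 8    # assumption: unsigned 8-bit kernel coefficients

def adder_tree(values):
    """Sum a list pairwise, level by level, the way a hardware adder tree does."""
    while len(values) > 1:
        values = [sum(values[i:i + 2]) for i in range(0, len(values), 2)]
    return values[0]

def convolve_pixel(window, kernel):
    """window and kernel are lists of M columns, each a list of M values."""
    acc = 0
    for col_pixels, col_coeffs in zip(window, kernel):    # one column per cycle
        products = [p * c for p, c in zip(col_pixels, col_coeffs)]
        acc += adder_tree(products)                       # accumulator
    return acc

product_bits = PIXEL_BITS + COEFF_BITS
tree_bits = product_bits + math.ceil(math.log2(M))        # growth inside the tree
acc_bits = product_bits + math.ceil(math.log2(M * M))     # tree plus accumulator
print(product_bits, tree_bits, acc_bits)
```

Under these assumptions the accumulator needs 8 bits more than a single product, which is the kind of guard-bit margin the slide refers to; the output is then truncated back to the wanted wordlength.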
Datapath Chip, 1993
1 μm standard CMOS technology
approx. 50 000 transistors
die area 8 x 6.5 mm²
2-dimensional Image Convolution
Image: N x N. Kernel: M x M ($M^2$ values).
Off-chip image memory
• Large
• High power
Every pixel is used in several calculations: $M^2 (N-M+1)^2$ reads in total
Multi-level memory hierarchy can be used.
How to use line memories
• Initial filling
• New line
• Each pixel operation: one external memory read + shift one pixel between memories
Only one external memory read per pixel!
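The effect of the line memories can be checked with a small counting model. It is a sketch only: `count_external_reads` is a made-up name, and the model counts reads and complete neighbourhoods without computing any filter output.

```python
from collections import deque

def count_external_reads(image, m):
    """Raster-scan an N x N image through M-1 line memories.

    Returns (external_reads, windows), where windows is the number of pixel
    positions for which a full M x M neighbourhood was available on chip.
    """
    lines = deque(maxlen=m - 1)      # the M-1 most recent complete image lines
    external_reads = 0
    windows = 0
    for row in image:
        current = []
        for pixel in row:            # exactly one external read per pixel
            external_reads += 1
            current.append(pixel)
            if len(lines) == m - 1 and len(current) >= m:
                windows += 1
        lines.append(current)        # the new line replaces the oldest one
    return external_reads, windows

n, m = 8, 3
image = [[0] * n for _ in range(n)]
reads, windows = count_external_reads(image, m)
print(reads, windows, m * m * windows)
```

For this 8 x 8 example with a 3 x 3 kernel the image memory is read 64 times, while reading every kernel tap from the image memory would take 324 reads.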
Memory Hierarchy, accesses
Image memory: off-chip
Line memories: M-1 with N words
Kernel memories: 1 with M words
Scheme              Image          Line           Kernel
Image               M^2(N-M+1)^2   -              -
Image line          N^2            M^2(N-M+1)^2   -
Image kernel        MN(N-M+1)      -              M^2(N-M+1)^2
Image line kernel   N^2            MN(N-M+1)      M^2(N-M+1)^2
Memory Hierarchy, energy
Image memory: off-chip, 0.18 μm, 60 nJ/access
Line memories: 0.35 μm CMOS, 4 nJ/access
Kernel memories: 0.35 μm CMOS, 1 nJ/access
N = 1024, M = 15, Wordlength = 16
Scheme              Energy
Image               13.8 J
Image line          1.0 J
Image kernel        1.2 J
Image line kernel   0.4 J
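The energy figures can be reproduced from the access counts of the previous slide. The sketch below assumes 60 nJ per access for the off-chip image memory, 4 nJ for the line memories and 1 nJ for the kernel memories; this assignment is inferred because it reproduces the four energies on the slide.

```python
N, M = 1024, 15
# Assumed assignment of the per-access energies (joule per access).
E_IMAGE, E_LINE, E_KERNEL = 60e-9, 4e-9, 1e-9

all_taps = M * M * (N - M + 1) ** 2      # M^2 (N-M+1)^2
column_reads = M * N * (N - M + 1)       # MN(N-M+1)
once = N * N                             # N^2

schemes = {
    "Image": all_taps * E_IMAGE,
    "Image line": once * E_IMAGE + all_taps * E_LINE,
    "Image kernel": column_reads * E_IMAGE + all_taps * E_KERNEL,
    "Image line kernel": once * E_IMAGE + column_reads * E_LINE + all_taps * E_KERNEL,
}
for name, joule in schemes.items():
    print(f"{name:18s} {joule:5.2f} J")
```

Most of the saving comes from moving the $M^2 (N-M+1)^2$ accesses away from the expensive off-chip memory; the full three-level hierarchy then also reduces the traffic into the line memories.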
Tailored architecture in the datapath design
[Figure: image processor without controller. Large off-chip memories, level 1 (256x256), feed an 8-bit input buffer; a new value is fed in during each new pixel operation. Line memories with pipelined registers form level 2 (15x256). The cache, level 3 (15x16), uses cyclic column storage with a 4-bit address line; a new column is written to the cache over the 15x8-bit system bus for each new pixel operation, as the kernel moves one pixel to the right. Four processor cores (core 1 to core 4) deliver data out, 4x24 bits. Control signals come from the controller. Unfilled memory elements are marked.]
Compared to a TMS320C80 Multimedia Video Processor (MVP)
Published 1995
Designed: 20 MHz
MVP: 50 MHz
MVP: 4 parallel DSPs + 1 master processor [3]. Each DSP unit contains one 16x16 bit multiplier, which can be split into two 8x8 bit multipliers.
Memory Considerations
We have registers, why memories?
D flip-flop: 252 µm²
Memory element: 30 µm²
Flip-flops vs. SRAM
Alcatel Microelectronics 0.35 µm CMOS technology process
Process and library dependent, but same trends
[Figure: area in square mm (0 to 1.8) versus number of memory elements (0 to 4500) for flip-flops, dual port memory, single port memory and double width memory.]
Crossover at approx. 200 bits for this technology
Hardware Aspects of a Real-time Surveillance System
An Intelligent Surveillance System
Blocks: Segmentation, Morphology, Labeling, Feature extraction, Tracking, Object classification
Input: Video from stationary camera
Output: Tracked objects
Spec: Xilinx Virtex II-Pro Development Platform, resolution 320x240, 25 frames per second
• Architectures for local decisions
• Embedded system requires real time and low power
The PhD Student Team
Three PhD students:
• Hongtu Jiang: sensor interface and segmentation
– PhD February 2007
• Fredrik Kristensen: system overview, feature extraction and tracking
– PhD September 2007
• Hugo Hedberg: morphology and labeling
– PhD April 2008
References, see: www.es.lth.se/asicdsp
The end result
System
CAM, Segmentation algorithm, Morph filter and labeling, Feature extraction, Tracking
Example output: Object 1, size = 1037, position = (56, 180), color_1 = 137, …
Segmentation
• Detects motion
• Generates a noisy binary mask due to errors caused by camera, fast light changes etc.
Background Modeling
Consecutive video frames 1, 2, 3, ..., n: P1(x0, y0), P2(x0, y0), P3(x0, y0), ..., Pn(x0, y0)
Sample background environment in the digital lab
Pixel values taken from the same location in consecutive video frames look like a Gaussian distribution in RGB color space, i.e. even when nothing is happening it's not a single value.
[Figure: scatter plot of the pixel values in RGB space, axes R, G and B.]
Multi-modal Background
[Figure: scatter plot of the pixel values in RGB space, axes R, G and B, showing two clusters.]
More complicated background pixels, such as lake surfaces and swaying trees, show two distributions and require two Gaussians to model.
Video segmentation based on Gaussian Mixture background Model (Stauffer and Grimson)
• Motion detection: detect moving objects in image sequences
• Each pixel over time is a "pixel process", modeled by Gaussian distributions
• Each background object corresponds to one Gaussian
• GMM is robust for handling multi-modal background situations
– swaying trees
– lake surface
– etc.
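A much simplified per-pixel update in the spirit of this model might look as follows. It is not the published algorithm: the fixed learning rate `alpha`, the single variance shared by R, G and B, the weight threshold `bg_weight` and all other parameter values are assumptions chosen to keep the sketch short.

```python
from dataclasses import dataclass

@dataclass
class Gaussian:
    weight: float
    mean: list      # [R, G, B]
    var: float      # one variance shared by the three colour channels

def update_pixel(mixture, rgb, alpha=0.01, k_sigma=2.5,
                 bg_weight=0.2, init_var=900.0, min_var=4.0):
    """Update the mixture of one pixel in place; return True if rgb is foreground."""
    matched, dist2 = None, 0.0
    for g in mixture:                                   # mixture is kept sorted
        d2 = sum((x - m) ** 2 for x, m in zip(rgb, g.mean))
        if d2 <= (k_sigma ** 2) * g.var:                # close enough: a match
            matched, dist2 = g, d2
            break
    for g in mixture:                                   # decay all, reward the match
        g.weight = (1 - alpha) * g.weight + (alpha if g is matched else 0.0)
    if matched is None:                                 # replace the weakest Gaussian
        weakest = min(mixture, key=lambda g: g.weight)
        weakest.weight, weakest.mean, weakest.var = alpha, list(rgb), init_var
        foreground = True
    else:
        matched.mean = [(1 - alpha) * m + alpha * x for x, m in zip(rgb, matched.mean)]
        matched.var = max(min_var, (1 - alpha) * matched.var + alpha * dist2)
        foreground = matched.weight < bg_weight         # weak Gaussians are not background
    total = sum(g.weight for g in mixture)
    for g in mixture:
        g.weight /= total
    mixture.sort(key=lambda g: g.weight / g.var ** 0.5, reverse=True)
    return foreground

mixture = [Gaussian(0.9, [100, 100, 100], 25.0),
           Gaussian(0.05, [0, 0, 0], 900.0),
           Gaussian(0.05, [255, 255, 255], 900.0)]
print(update_pixel(mixture, (101, 99, 100)))   # close to the dominant Gaussian
print(update_pixel(mixture, (200, 50, 10)))    # matches nothing
```

Each Gaussian here carries five numbers (one weight, three means, one variance), so a mixture of three needs 15 values per pixel in addition to the incoming RGB sample.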
Hardware Implementation Considerations
[Block diagram: RGB pixel stream, Gaussian parameter memory (memory bottleneck), Sorting of Gaussians, Matching Network, Decision Network, Labeling, bitmask, labeled bitstream, post-processing.]
Fully parallel and pipelined design aiming for one pixel per clock cycle
Most important design parameter: high memory bandwidth
15 variables/pixel + RGB, i.e. 5 parameters for each Gaussian distribution x 3
Hardware Implementation Considerations
[Block diagram: as before, with Encoding/Decoding + Buffer added at the Gaussian parameter memory.]
• Idea: Neighbouring pixels have similar parameters
• Use some form of Run Length Encoding
• Simulations show reduction of memory access by >50%
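The run-length idea can be illustrated on one image row of parameter sets. In this sketch two neighbouring pixels share a run only if their parameters are exactly equal, which is an assumption; the slides only say that the parameters are similar. The function names and the example values are made up.

```python
def rle_encode(params):
    """Collapse runs of identical parameter sets into [value, run_length] pairs."""
    runs = []
    for p in params:
        if runs and runs[-1][0] == p:
            runs[-1][1] += 1          # same parameters as the previous pixel
        else:
            runs.append([p, 1])       # start a new run
    return runs

def rle_decode(runs):
    """Expand [value, run_length] pairs back into one entry per pixel."""
    return [p for p, n in runs for _ in range(n)]

# Hypothetical row: eight pixels, each with one (mean R, mean G, mean B) triple.
row = [(120, 118, 90)] * 5 + [(30, 200, 40)] * 2 + [(120, 118, 90)]
encoded = rle_encode(row)
print(len(row), "parameter sets stored as", len(encoded), "runs")
```

Decoding restores the original row, so with exact matching the scheme is lossless; relaxing the match criterion increases the saving at the cost of accuracy.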
Memory bandwidth reduction
[Block diagram: Kodak CMOS sensor, Matching & Bitmask, Parameter Reforming & Sorting, Parameter Saving, DDR SDRAM.]
Gaussian distribution represented as a cube (labels: mean, variance x 2.5)
Two overlapping Gaussian distributions (red cube)
• Pros: If Gaussians with 80% overlap are regarded as the "same" Gaussian, more than 60% memory saving can be expected
• Cons: More noise is generated in the binary mask
Memory Reduction Results
[Figure: memory bandwidth reduction versus frame number, and memory bandwidth reduction versus threshold (0.5 to 0.9).]
• Different memory bandwidth savings with different thresholds
• Too low a threshold results in clustered noise that cannot be removed by morphology
Memory Reduction Results
[Figure: segmentation with different thresholds, and results after morphology, with clustered noise marked.]
Segmentation results
Shadow reduction is important!
Original image
Output image after segmentation
System
CAM, Segmentation algorithm, Morph filter and labeling, Feature extraction, Tracking
Example output: Object 1, size = 1037, position = (56, 180), color_1 = 137, …
Segmented input image
Output image after clustering
Morphology
Greek morphe "shape", -ology "the study of": the study of shapes
• Applies to many number representations
– In our application, only binary input is considered
• Structuring element (SE)
– Sliding window
[Figure: arbitrary binary image (each pixel 1/0) with a 3x3 SE of ones and its origin marked.]
Morphology
Similar to convolution but more on the logic level
Important operations
• Erosion: Shrinks (minimum)
• Dilation: Expands (maximum)
• Opening (erosion followed by dilation):
– Noise reduction
• Closing (dilation followed by erosion):
– Reconnect split objects
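The four operations can be sketched for binary images stored as nested lists. The helper `_scan`, the choice to read pixels outside the image as 0, and the restriction to symmetric SEs are assumptions of this sketch.

```python
def _scan(img, se, combine):
    """Apply `combine` (all or any) to the pixels under the SE at every position."""
    h, w = len(img), len(img[0])
    ry, rx = len(se) // 2, len(se[0]) // 2       # SE origin at its centre
    def at(y, x):
        return img[y][x] if 0 <= y < h and 0 <= x < w else 0   # outside reads as 0
    return [[int(combine(at(y + dy - ry, x + dx - rx)
                         for dy, se_row in enumerate(se)
                         for dx, s in enumerate(se_row) if s))
             for x in range(w)] for y in range(h)]

def erode(img, se):      # minimum, i.e. logical AND under the SE
    return _scan(img, se, all)

def dilate(img, se):     # maximum, i.e. logical OR under the SE
    return _scan(img, se, any)

def opening(img, se):    # removes objects smaller than the SE (noise)
    return dilate(erode(img, se), se)

def closing(img, se):    # fills gaps smaller than the SE (reconnects objects)
    return erode(dilate(img, se), se)

se3 = [[1, 1, 1]] * 3
noisy = [[1, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0],
         [0, 0, 1, 1, 1, 0],
         [0, 0, 1, 1, 1, 0],
         [0, 0, 1, 1, 1, 0],
         [0, 0, 0, 0, 0, 0]]
for row in opening(noisy, se3):
    print(row)
```

In the example the isolated pixel in the top-left corner disappears after opening, while the 3x3 object survives unchanged.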
Morphology
[Figure: an example image processed with a given SE, showing erosion and dilation.]
Opening:
• Erosion followed by dilation
– Noise reduction
Morphology, erosion ("and")
Input     SE      Result
000000            000000
011110    111     000000
011100    111  =  001000
011100    111     000000
000000            000000
000000            000000
Morphology, dilation ("or")
[Figure: the same 6x6 binary input image dilated with the 3x3 SE of ones; the object grows by one pixel in every direction.]
Morphology, sliding window: direct-mapped implementation
[Figure, three animation steps over a 6x6 index image with pixels numbered 1 to 36. A 3x3 window of flip-flops (ff) and two row FIFOs feed the erosion/dilation logic, which produces the output.
Step 1: window 1,2,3 / 7,8,9 / 13,14,15; FIFOs 4,5,6 and 10,11,12; input 16,...,36.
Step 2: window 2,3,4 / 8,9,10 / 14,15,16; FIFOs 5,6,7 and 11,12,13; input 17,18,19,...
Step 3: window 22,23,24 / 28,29,30 / 34,35,36; FIFOs 25,26,27 and 31,32,33; input empty.]
Pros:
• Supports arbitrary SEs
Cons:
• Unsuitable for large SEs
Compare to the 2D convolution architecture!
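The window and FIFO arrangement can be imitated in software with one delay line. This is a behavioural sketch, not the hardware: `sliding_windows` is an illustrative name and a single deque stands in for the flip-flops and row FIFOs together.

```python
from collections import deque

def sliding_windows(pixels, width, k=3):
    """Yield every complete k x k window of a raster-scanned pixel stream.

    The delay line holds k-1 full rows plus k pixels, which corresponds to the
    k x k flip-flop window plus the k-1 row FIFOs between the window rows.
    """
    depth = (k - 1) * width + k
    line = deque(maxlen=depth)
    for i, p in enumerate(pixels):
        line.append(p)                         # one new pixel per clock cycle
        if len(line) == depth and i % width >= k - 1:
            yield [[line[r * width + c] for c in range(k)] for r in range(k)]

index_image = range(1, 37)                     # the 6x6 index image, pixels 1..36
windows = list(sliding_windows(index_image, width=6))
print(windows[0])
print(windows[1])
print(windows[-1])
```

Erosion of a binary stream is then `all(...)` over each window and dilation is `any(...)`; because every window element is available at once, any SE shape can be used, at the price of k x k storage elements.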
Decomposition
$B = B_1 \oplus B_2$
[Figure: a rectangular SE of size SE width x SE height equals a row SE of length SE width combined with a column SE of length SE height.]
Morphology, erosion ("and")
Input     SE      Result
000000            000000
011110    111     000000
011100    111  =  001000
011100    111     000000
000000            000000
000000            000000
Decomposition, 1st step
Input     SE      Result
000000            000000
011110            001100
011100    111  =  001000
011100            001000
000000            000000
000000            000000
Decomposition, 2nd step
Input     SE      Result
000000            000000
001100    1       000000
001000    1    =  001000
001000    1       000000
000000            000000
000000            000000
Morphology Architecture
[Figure: two counter stages built from adders, muxes, flip-flops and a row memory; stage 1 compares with SE width, stage 2 with SE height.]
Stage 1: Number of ones in the same row, compared with SE width
Example with SE width = 3 (oldest sample to the right): In ..., 0, 1, 1, 1, 1, 0, ... gives Out ..., 0, 1, 1, 0, 0, 0, ...
Stage 2: Number of consecutive lines with SE width ones, compared with SE height
SE width x SE height rectangular SE
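A behavioural model of the two counter stages might look like this. It works on whole rows instead of a pixel stream, uses saturating counters, and places the output at the bottom-right corner of the SE; all three are choices made for the sketch rather than details given on the slides.

```python
def stage1(row, se_w):
    """Per pixel: 1 if the last se_w pixels in this row were all ones."""
    run, out = 0, []
    for pix in row:
        run = min(run + 1, se_w) if pix else 0     # saturating run counter
        out.append(int(run == se_w))
    return out

def stage2(rows, se_h):
    """Per pixel: 1 if stage 1 fired in this column on the last se_h lines."""
    row_mem = [0] * len(rows[0])                   # one counter per column
    out = []
    for row in rows:
        row_mem = [min(c + 1, se_h) if hit else 0 for c, hit in zip(row_mem, row)]
        out.append([int(c == se_h) for c in row_mem])
    return out

def erode_rect(image, se_w, se_h):
    """Erosion by an se_w x se_h rectangle of ones using the two stages."""
    return stage2([stage1(r, se_w) for r in image], se_h)

image = [[0, 0, 0, 0, 0, 0],
         [0, 1, 1, 1, 1, 0],
         [0, 1, 1, 1, 0, 0],
         [0, 1, 1, 1, 0, 0],
         [0, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0]]
print(stage1([0, 1, 1, 1, 1, 0], 3))
for row in erode_rect(image, 3, 3):
    print(row)
```

The storage is one run counter plus one counter per image column, independent of the SE area, which is why this structure scales to large rectangular SEs where the direct-mapped window does not.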
Duality
$A \oplus B = (A' \ominus B)'$
$A \ominus B = (A' \oplus B)'$
[Figure: duality example on a binary image, showing $A \oplus B$, $A' \ominus B$ and $(A' \ominus B)'$.]
Both operations can run on the same hardware by inverting the input and output streams.
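The duality can be checked numerically. In this sketch the SE is a centred rectangle of ones, and pixels outside the image are read as 1 during the erosion of the inverted image (the complement of the usual 0 border); both are assumptions needed to make the two sides agree at the image edge.

```python
def erode(img, se_h, se_w, border):
    """Erosion by an se_h x se_w rectangle of ones (odd sizes), origin at centre.

    Pixels outside the image are read as `border`.
    """
    h, w = len(img), len(img[0])
    ry, rx = se_h // 2, se_w // 2
    def at(y, x):
        return img[y][x] if 0 <= y < h and 0 <= x < w else border
    return [[int(all(at(y + dy, x + dx)
                     for dy in range(-ry, ry + 1) for dx in range(-rx, rx + 1)))
             for x in range(w)] for y in range(h)]

def invert(img):
    return [[1 - p for p in row] for row in img]

def dilate_direct(img, se_h, se_w):
    """Reference dilation: OR over the window, outside pixels read as 0."""
    h, w = len(img), len(img[0])
    ry, rx = se_h // 2, se_w // 2
    return [[int(any(img[y + dy][x + dx]
                     for dy in range(-ry, ry + 1) for dx in range(-rx, rx + 1)
                     if 0 <= y + dy < h and 0 <= x + dx < w))
             for x in range(w)] for y in range(h)]

def dilate_by_duality(img, se_h, se_w):
    """Dilation on the erosion hardware: invert, erode, invert."""
    return invert(erode(invert(img), se_h, se_w, border=1))

image = [[0, 0, 0, 0, 0, 0],
         [0, 1, 1, 1, 1, 0],
         [0, 1, 1, 1, 0, 0],
         [0, 1, 1, 1, 0, 0],
         [0, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0]]
print(dilate_by_duality(image, 3, 3) == dilate_direct(image, 3, 3))
```

Only the two inverters and the border value change between the operations, so a single erosion datapath can serve both.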
Morphology Architecture
[Figure: the two-stage counter architecture with row memory, extended with operation-controlled inverters at input and output and border muxes (North, South, West, East).]
Stage 0: Inverts if dilation is performed
Stage 1: Number of ones in the same row, compared with SE width
Stage 2: Number of consecutive lines with SE width ones, compared with SE height
Stage 3: Inverts if dilation is performed
Stages 0 and 3 implement the duality.
Morphology in our application
• Noise reduction
• Reconnect split objects
Low complexity architecture with low memory requirements
Prototype
Embedded Hardware Platform
[Block diagram of the FPGA chip: Sensor, FIFO, Segm., Morph, Label, feature memories (Feat. Mem 0 and 1), label memories (Label Mem 0 and 1), result memory, PPC with SW memory, read-and-draw-boxes block, VGA controller with VGA memory, and display; the DDR memory is reached over the bus.]
Digital Holography
Transposition
Digital Holography
Application: microscope based on digital holography
• A digital image sensor replaces the photographic film
• Interference pattern, reference and object light are captured separately
• Computer algorithm generates the image
[Figure: laser, reference light, object, object light and digital image sensor.]
Advantage 1 - Phase information
Makes transparent objects visible
[Figure panels: amplitude, unwrapped phase, refraction index.]
Advantage 2 - Focus
All focus information in one single recording
[Figure: head of a greenfly, scale bar 1 mm.]
Phase Holographic Imaging
A cell analyzer to envision and monitor transparent living cells in vitro, in their growth environment, without the need for artificial staining. It makes it possible to quantify a large number of parameters in real time.
Pseudo 3D-image of cells generated from the phase information. www.phiab.se
Time-lapse study of cell division
Wilms' tumor is a rare type of kidney cancer that affects children.
Time-lapse study: a sequence of consecutive images
Important issues
• Processing and efficiency
– Processor vs. FPGA/ASIC
• Memory access and throughput
• FFT selection
XSTREAM - 2D FFT
• A two-dimensional FFT can be evaluated by
– Applying a one-dimensional FFT over the rows
– Applying a one-dimensional FFT over the columns of the result
• Burst read: column access is slow
– Transpose the memory between operations and only operate on rows
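The row-only scheme can be sketched with a naive DFT standing in for the FFT core. The function names are illustrative and the naive O(n²) transform is used only to keep the example self-contained.

```python
import cmath

def dft(row):
    """Naive 1-D DFT, standing in for a real FFT core."""
    n = len(row)
    return [sum(row[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def transpose(m):
    return [list(col) for col in zip(*m)]

def dft2_rows_only(image):
    """2-D transform using row operations only: rows, transpose, rows, transpose."""
    step1 = [dft(r) for r in image]                 # 1-D transform over the rows
    step2 = [dft(r) for r in transpose(step1)]      # columns are now rows
    return transpose(step2)                         # restore the orientation

image = [[1, 2, 3, 4], [5, 6, 7, 8], [0, 1, 0, 1], [2, 0, 2, 0]]
spectrum = dft2_rows_only(image)
print(round(abs(spectrum[0][0]), 6))                # DC term: sum of all pixels
```

Both transform passes read and write complete rows, so every memory access can be a burst; the cost is the transposition in between.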
Memory and throughput
Overhead = (Setup + N) / N
N = 1: Overhead 800%
N = 32: Overhead 21%
[Figure: burst access of words 0 to N-1.]
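The two overhead figures can be related with a short calculation. The setup time of 7 cycles is an assumption, inferred from the 800% figure at N = 1; with it, a burst of 32 takes about 122% of the ideal time, i.e. roughly the 21% extra quoted on the slide.

```python
SETUP = 7   # cycles; assumption inferred from the 800% figure at N = 1

def relative_time(n, setup=SETUP):
    """Cycles spent per burst relative to the ideal n cycles: (setup + n) / n."""
    return (setup + n) / n

for n in (1, 8, 32, 128):
    extra = relative_time(n) - 1
    print(f"N={n:4d}  time={relative_time(n):7.1%}  extra={extra:7.1%}")
```

Most of the gain is already reached at moderate burst lengths; going from N = 32 to N = 128 only removes a further 16 percentage points.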
XSTREAM - Transpose
• Divide the matrix into macro-blocks (32x32)
– Transpose macro-blocks individually
– Relocate transposed macro-blocks
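The macro-block scheme can be sketched as follows. The function name and the in-memory list representation are assumptions; in hardware each macro-block row would be one burst access.

```python
def transpose_blocked(matrix, block=32):
    """Transpose a square matrix one block x block macro-block at a time."""
    n = len(matrix)
    out = [[None] * n for _ in range(n)]
    for br in range(0, n, block):
        for bc in range(0, n, block):
            # Read one macro-block as short row bursts and transpose it locally.
            tile = [matrix[r][bc:bc + block] for r in range(br, min(br + block, n))]
            tile_t = [list(col) for col in zip(*tile)]
            # Relocate the transposed macro-block to the mirrored block position.
            for i, row in enumerate(tile_t):
                out[bc + i][br:br + len(row)] = row
    return out

matrix = [[r * 8 + c for c in range(8)] for r in range(8)]
print(transpose_blocked(matrix, block=4) == [list(col) for col in zip(*matrix)])
```

Every access touches `block` consecutive words, so the memory is only ever used with bursts of that length instead of single-word column accesses.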
XSTREAM - 2D FFT
A "rather" small burst size gives a large gain!