Fast Bilateral Filter GPU implementation
Multi-Core Architectures and Programming
Gerhard Mlady, Rafael Bernardelli
Hardware/Software Co-Design, University of Erlangen-Nuremberg
July 21, 2016
Overview
• Fast Bilateral Filter
• Implementation
• Benchmark
21/07/2016Mlady, Bernardelli | Multi-Core Architectures and Programming | Fast Bilateral Filter GPU implementation 2
Introduction to Bilateral filter
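First, let's take a look at 2D convolution:
$$ f(x) = \sum_{x' \in \mathcal{N}(x)} g(x')\, k(x - x') $$
$$ f(x) = g(x) * k(x) $$
$$ f(x) = \mathcal{F}^{-1}\{\mathcal{F}\{g(x)\}\, \mathcal{F}\{k(x)\}\} $$
where $k$ is a Gaussian kernel.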
(a) Original image
(b) Gaussian filtered image
(a) Original image
(b) Gaussian filtered image
(c) Bilateral filtered image*
(d) Bilateral filtered image**
(a) Original image
(b) Bilateral kernel
(c) Result
Bilateral filter equations
$$ f(\mathbf{x}) = k^{-1}(\mathbf{x}) \sum_{\mathbf{x}' \in \mathcal{N}} g(\mathbf{x}')\, c(\mathbf{x}, \mathbf{x}')\, s\big(g(\mathbf{x}), g(\mathbf{x}')\big) $$
$f(\mathbf{x})$: Output image
$g(\mathbf{x})$: Input image
Spatial similarity function:
$$ c(\mathbf{x}, \mathbf{x}') = \exp\left(-\frac{\lVert \mathbf{x} - \mathbf{x}' \rVert_2^2}{2\sigma_d^2}\right) $$
Range similarity function:
$$ s\big(g(\mathbf{x}), g(\mathbf{x}')\big) = \exp\left(-\frac{\lvert g(\mathbf{x}) - g(\mathbf{x}') \rvert^2}{2\sigma_r^2}\right) $$
Normalizing factor:
$$ k(\mathbf{x}) = \sum_{\mathbf{x}' \in \mathcal{N}} c(\mathbf{x}, \mathbf{x}')\, s\big(g(\mathbf{x}), g(\mathbf{x}')\big) $$
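The equations above can be turned into a brute-force reference filter, which is useful for checking a fast implementation on small images. The following is a minimal host-side C++ sketch, not the authors' code: the function name `bilateral_bruteforce`, the square window of half-width `radius` used as the neighbourhood, and the row-major grayscale layout are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Brute-force bilateral filter on a row-major grayscale image g (width * height).
// The neighbourhood is a (2*radius+1)^2 window clipped at the image border.
// radius, sigma_d and sigma_r are illustrative parameters, not values from the slides.
std::vector<float> bilateral_bruteforce(const std::vector<float>& g,
                                        int width, int height, int radius,
                                        float sigma_d, float sigma_r)
{
    assert(g.size() == static_cast<std::size_t>(width) * height);
    std::vector<float> f(g.size());
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const float center = g[y * width + x];
            float sum = 0.0f;  // sum of g(x') * c * s over the neighbourhood
            float k = 0.0f;    // normalizing factor k(x)
            for (int dy = -radius; dy <= radius; ++dy) {
                for (int dx = -radius; dx <= radius; ++dx) {
                    const int xx = x + dx, yy = y + dy;
                    if (xx < 0 || xx >= width || yy < 0 || yy >= height) continue;
                    const float v = g[yy * width + xx];
                    // Spatial similarity c(x, x')
                    const float c = std::exp(-(dx * dx + dy * dy) / (2.0f * sigma_d * sigma_d));
                    // Range similarity s(g(x), g(x'))
                    const float diff = center - v;
                    const float s = std::exp(-(diff * diff) / (2.0f * sigma_r * sigma_r));
                    sum += v * c * s;
                    k += c * s;
                }
            }
            // k >= 1 because the center pixel always contributes c = s = 1.
            f[y * width + x] = sum / k;
        }
    }
    return f;
}
```

The cost grows with the window area for every pixel, which is the motivation for the fast approximation presented next.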
Fast Bilateral Filter
Implementation
• Load image into GPU
• Fill cubes
• Perform separable convolution
• Slicing & nonlinearity
• Copy image back
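The middle three stages can be sketched on the host to make the data flow concrete. This is a simplified C++ illustration under stated assumptions, not the GPU implementation: the names `Grid`, `fill_cubes`, `blur_axis` and `fast_bilateral` are hypothetical, the blur uses a fixed [1 2 1] kernel on each axis, slicing is a nearest-cell lookup rather than interpolation, and the nonlinearity is assumed to be the division WI / W. Intensities are assumed to lie in [0, max_intensity].

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Downsampled (x, y, intensity) grid holding the two cubes.
struct Grid {
    int nx = 0, ny = 0, nz = 0;
    std::vector<float> wi;  // summed intensities per cell
    std::vector<float> w;   // pixel counts per cell
    std::size_t idx(int x, int y, int z) const {
        return static_cast<std::size_t>(x) + static_cast<std::size_t>(nx) *
               (static_cast<std::size_t>(y) + static_cast<std::size_t>(ny) * static_cast<std::size_t>(z));
    }
};

// Fill cubes: every pixel adds its intensity to WI and 1 to W in its cell.
Grid fill_cubes(const std::vector<float>& image, int width, int height,
                int scale_xy, int scale_eps, float max_intensity)
{
    assert(image.size() == static_cast<std::size_t>(width) * height);
    Grid g;
    g.nx = (width + scale_xy - 1) / scale_xy;
    g.ny = (height + scale_xy - 1) / scale_xy;
    g.nz = static_cast<int>(max_intensity) / scale_eps + 1;
    g.wi.assign(static_cast<std::size_t>(g.nx) * g.ny * g.nz, 0.0f);
    g.w.assign(g.wi.size(), 0.0f);
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x) {
            const float k = image[static_cast<std::size_t>(y) * width + x];
            const std::size_t c = g.idx(x / scale_xy, y / scale_xy,
                                        static_cast<int>(k) / scale_eps);
            g.wi[c] += k;
            g.w[c] += 1.0f;
        }
    return g;
}

// One [1 2 1] pass along one axis (0 = x, 1 = y, 2 = intensity).
// Cells outside the grid count as zero.
void blur_axis(const Grid& g, std::vector<float>& cube, int axis)
{
    const std::vector<float> src = cube;  // read from a copy, write in place
    const int dx = (axis == 0) ? 1 : 0, dy = (axis == 1) ? 1 : 0, dz = (axis == 2) ? 1 : 0;
    for (int z = 0; z < g.nz; ++z)
        for (int y = 0; y < g.ny; ++y)
            for (int x = 0; x < g.nx; ++x) {
                float acc = 2.0f * src[g.idx(x, y, z)];
                for (int s = -1; s <= 1; s += 2) {
                    const int xx = x + s * dx, yy = y + s * dy, zz = z + s * dz;
                    if (xx >= 0 && xx < g.nx && yy >= 0 && yy < g.ny && zz >= 0 && zz < g.nz)
                        acc += src[g.idx(xx, yy, zz)];
                }
                cube[g.idx(x, y, z)] = acc;
            }
}

// Fill, blur both cubes along all three axes, then slice and divide.
std::vector<float> fast_bilateral(const std::vector<float>& image, int width, int height,
                                  int scale_xy, int scale_eps, float max_intensity)
{
    Grid g = fill_cubes(image, width, height, scale_xy, scale_eps, max_intensity);
    for (int axis = 0; axis < 3; ++axis) {
        blur_axis(g, g.wi, axis);
        blur_axis(g, g.w, axis);
    }
    std::vector<float> out(image.size());
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x) {
            const float k = image[static_cast<std::size_t>(y) * width + x];
            const std::size_t c = g.idx(x / scale_xy, y / scale_xy,
                                        static_cast<int>(k) / scale_eps);
            // W is positive here: the pixel itself was counted in cell c.
            out[static_cast<std::size_t>(y) * width + x] = g.wi[c] / g.w[c];
        }
    return out;
}
```

Because WI and W go through the same linear blur, a region of constant intensity is reproduced unchanged, while pixels whose intensities fall in distant bins do not mix. That separation along the intensity axis is what preserves edges.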
Cube filling
[Figure: the image is accumulated into two cubes, WI and W.]
Perform separable convolution
$$ I * \begin{bmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \end{bmatrix} = I * \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} * \begin{bmatrix} 1 & 2 & 1 \end{bmatrix} $$
Cost: $O(NMk^2)$ for the full 2D kernel versus $O(2 \cdot NMk)$ for the two 1D passes.
$N, M$: image dimensions
$k$: convolution kernel length
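The identity above can be checked numerically on a small image. The sketch below is host-side C++ written for illustration (the names `Image`, `pixel_or_zero`, `conv2d_3x3` and `conv1d_121` are hypothetical, and zero padding at the border is an assumption): one full 3x3 pass and two chained 1D passes give the same result.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Image = std::vector<float>;  // row-major, w * h pixels

// Pixel access with zero padding outside the image.
static float pixel_or_zero(const Image& img, int w, int h, int x, int y) {
    return (x < 0 || x >= w || y < 0 || y >= h) ? 0.0f : img[y * w + x];
}

// Full 2D convolution with [1 2 1; 2 4 2; 1 2 1]: 9 multiply-adds per pixel.
Image conv2d_3x3(const Image& img, int w, int h) {
    static const float K[3][3] = {{1, 2, 1}, {2, 4, 2}, {1, 2, 1}};
    assert(img.size() == static_cast<std::size_t>(w) * h);
    Image out(img.size());
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x) {
            float acc = 0.0f;
            for (int dy = -1; dy <= 1; ++dy)
                for (int dx = -1; dx <= 1; ++dx)
                    acc += K[dy + 1][dx + 1] * pixel_or_zero(img, w, h, x - dx, y - dy);
            out[y * w + x] = acc;
        }
    return out;
}

// One 1D pass with [1 2 1], along x if horizontal is true, otherwise along y:
// 3 multiply-adds per pixel.
Image conv1d_121(const Image& img, int w, int h, bool horizontal) {
    assert(img.size() == static_cast<std::size_t>(w) * h);
    const int sx = horizontal ? 1 : 0, sy = horizontal ? 0 : 1;
    Image out(img.size());
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x)
            out[y * w + x] = pixel_or_zero(img, w, h, x - sx, y - sy)
                           + 2.0f * pixel_or_zero(img, w, h, x, y)
                           + pixel_or_zero(img, w, h, x + sx, y + sy);
    return out;
}
```

Usage: `conv1d_121(conv1d_121(img, w, h, true), w, h, false)` should match `conv2d_3x3(img, w, h)`. For a kernel of length k the two passes cost 2k operations per pixel instead of k², and the same idea extends to the third (intensity) axis of the cubes.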
Perform Slicing & Nonlinearity
[Figure: the cubes WI and W are sliced to produce the output image.]
Texture fetching
Benchmark
for intensity kernel length, intensity and spatial scaling 11
Benchmark – Best method for filling
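// Cube filling, one thread per downsampled cell (i, j): the thread loops over the
// scale_xy x scale_xy pixels of its cell, so no atomics are needed.
// Both cubes must be zeroed before launch.
__global__ void cubefilling_loop(const float* image, float* dev_cube_wi, float* dev_cube_w,
                                 const dim3 image_size, int scale_xy, int scale_eps,
                                 dim3 dimensions_down)
{
    const unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;
    const unsigned int j = blockIdx.y * blockDim.y + threadIdx.y;
    if (i < dimensions_down.x && j < dimensions_down.y) {
        const size_t cube_idx_1 = i + (size_t)dimensions_down.x * j;
        const size_t slice_size = (size_t)dimensions_down.x * dimensions_down.y;
        #pragma unroll
        for (int ii = 0; ii < scale_xy; ii++) {
            #pragma unroll
            for (int jj = 0; jj < scale_xy; jj++) {
                const size_t i_idx = (size_t)scale_xy * i + ii;
                const size_t j_idx = (size_t)scale_xy * j + jj;
                if (i_idx < image_size.x && j_idx < image_size.y) {
                    const float k = image[i_idx + image_size.x * j_idx];
                    // Integer intensity bin (assumes k >= 0) keeps the index arithmetic exact.
                    const size_t bin = (size_t)(k / scale_eps);
                    const size_t cube_idx_2 = cube_idx_1 + slice_size * bin;
                    dev_cube_wi[cube_idx_2] += k;
                    dev_cube_w[cube_idx_2] += 1.0f;
                }
            }
        }
    }
}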
// Cube filling, one thread per image pixel: many pixels map to the same cell,
// so the accumulation uses atomicAdd. Both cubes must be zeroed before launch.
__global__ void cubefilling_atomic(const float* image, float* dev_cube_wi, float* dev_cube_w,
                                   const dim3 image_size, int scale_xy, int scale_eps,
                                   dim3 dimensions_down)
{
    const size_t i = blockIdx.x * blockDim.x + threadIdx.x;
    const size_t j = blockIdx.y * blockDim.y + threadIdx.y;
    if (i < image_size.x && j < image_size.y) {
        const float k = image[i + image_size.x * j];
        // Cell index: downsampled x, downsampled y, intensity bin (assumes k >= 0).
        const size_t cube_idx = (i / scale_xy) + dimensions_down.x * (j / scale_xy) +
            (size_t)dimensions_down.x * dimensions_down.y * ((int)k / scale_eps);
        atomicAdd(&dev_cube_wi[cube_idx], k);
        atomicAdd(&dev_cube_w[cube_idx], 1.0f);
    }
}
Benchmark
[Figure: Relative runtime on Tesla K20c. Share of total runtime (0% to 100%) of convolution, cubefilling and slicing, plotted against image size (51 to 512).]
Benchmark - CPU
for intensity kernel length 21 and spatial kernel length 25
Issues with the Implementation
• On Windows, a registry key must be set to allow the GPU to run a kernel for more than 2 seconds (HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers > TdrDelay=10)
• On (almost) every CUDA API call, retrieve cudaStatus and check it for errors. The API doesn't raise any exceptions.
• On Linux, add some lines to .bashrc:
export PKG_CONFIG_PATH=/scratch-local/usr/lib64/pkgconfig/
export PATH=$PATH:/opt/cuda/bin
export LD_LIBRARY_PATH=/scratch-local/usr/lib/:$LD_LIBRARY_PATH
If you are interested in the code and more plots…
• https://github.com/bernardelli/Bilateral-Filter-GPU
Thank You For Your Attention!