Report on iCAM2006 Implementation


iCAM2006 in Apple Quartz Composer

Michael Lewis

September 11, 2008

1 Project Description

The original intent of this project was to implement the High Dynamic Range (HDR) tone mapping algorithm of the image appearance model iCAM06 [3] in Apple’s Quartz Composer. Quartz Composer is a visual programming language in which image filters, sound filters, and other media-related compositions can be created using a drag-and-drop interface with little coding (if desired).

Image processing is very computationally intensive, so the ability to offload the processing onto more specialized hardware, such as a graphics processing unit (GPU) on a video card, is very desirable, especially for larger images and more complex operations. Specialized hardware like GPUs can perform the same calculations orders of magnitude faster than a general-purpose microprocessor. For example, in 2003, a GPU could perform an RGB ⇒ L∗a∗b∗ transformation up to 10 times faster than a general-purpose microprocessor [2]. More complex operations, such as Fast Fourier Transforms, can also be performed on the GPU. More recent research has shown that the GPU can perform between 10 and 200 times faster than a general CPU for various operations [1].

Quartz Composer exploits the power of the GPU for image processing in two ways. The first is by offloading most common operations, such as blurs and color transforms, onto the GPU. The second is by allowing the user to create custom filters, also known as patches, in a C-like language which can be considered a subset of the OpenGL Shading Language (GLSL). GLSL allows programmers to write code in a familiar language that is compiled at runtime into code executable by the GPU. Although originally designed as a pixel shading language, it has been used for more general math-intensive computing as well.

At first glance, Quartz Composer provides a very intuitive programming environment for image processing, and potentially the performance required to process images in a timely fashion. We therefore felt that it would be an ideal environment in which to implement iCAM06 for HDR image mapping.

2 Implementation

The first iteration of the iCAM06 implementation was an attempt to implement it fully without referring to existing implementations, such as the MATLAB one or a previous attempt in Quartz Composer by a different author. The main reference used was iCAM06: A refined image appearance model for HDR image rendering [3].

This attempt went well until it ran into implementation issues that weren’t fully explained in the paper, such as details on parameters for bilateral filtering, good defaults, etc.

The next attempt was done using the MATLAB code and the previous Quartz Composer attempt. This one went much better (and lasted a lot longer). Although much progress was made, there were several “show stoppers” with Quartz Composer.

3 Quartz Composer Implementation Issues

3.1 Bilateral Filter/Base Layer

I was told the bilateral filter would be an issue when I took on this project. Fortunately, I found an implementation of a bilateral filter on the GPU on the Internet (http://gpubilateral.sourceforge.net/). This code had a few shortcomings, which couldn’t really be fixed by either rewriting or tweaking it, because they stem from limitations of the GL shading language and GPUs in general.

The problem with GLSL (and the implementation done by Apple) is that no branches are allowed in the code. This means every for loop in the code has to be unrolled by the compiler, which has the following consequences:

• Nested loops generate an insane amount of code, and lots of overhead.

• Variable-size kernels (i.e. kernels to convolve with) aren’t possible in GLSL.

• Large kernels are very slow (even on the GPU).

• You can’t query arbitrary pixels.


I was able to work around these limitations by scaling the image down to a fixed size before applying the bilateral filter, and then scaling the result back up to the original size to get the base layer, but this produced sub-par results for larger images. The bilateral filter was still quite slow, even after heavily modifying the original one. Also, I’m not sure whether the original bilateral filter was 100% compatible with the MATLAB implementation, and it required some magic variables as arguments which didn’t map 1:1 to the MATLAB implementation.
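The downscale, filter, upscale workaround can be illustrated in one dimension. The following is a minimal Python sketch under stated assumptions, not the Quartz Composer patch or the MATLAB code: the function names, the window radius, and the two sigma values are hypothetical, the filter is a brute-force bilateral filter with a fixed-size window (mirroring the fixed-kernel constraint above), and the resampling is deliberately crude.

```python
import math


def bilateral_1d(signal, radius=2, sigma_s=1.5, sigma_r=0.35):
    """Brute-force bilateral filter on a 1-D list with a fixed window radius."""
    n = len(signal)
    out = []
    for i in range(n):
        total = 0.0
        weight_sum = 0.0
        for k in range(-radius, radius + 1):
            j = min(max(i + k, 0), n - 1)  # clamp the index at the borders
            # Spatial weight: falls off with distance from the centre sample.
            spatial = math.exp(-(k * k) / (2.0 * sigma_s ** 2))
            # Range weight: falls off with difference in value (edge-preserving).
            diff = signal[j] - signal[i]
            value = math.exp(-(diff * diff) / (2.0 * sigma_r ** 2))
            weight = spatial * value
            total += weight * signal[j]
            weight_sum += weight
        out.append(total / weight_sum)
    return out


def downscale(signal, factor):
    """Average non-overlapping blocks of `factor` samples."""
    blocks = [signal[i:i + factor] for i in range(0, len(signal), factor)]
    return [sum(block) / len(block) for block in blocks]


def upscale(signal, factor, length):
    """Nearest-neighbour upscale back to `length` samples."""
    return [signal[min(i // factor, len(signal) - 1)] for i in range(length)]


def base_layer(signal, factor=2):
    """Filter a reduced copy of the signal, then scale it back up."""
    small = downscale(signal, factor)
    return upscale(bilateral_1d(small), factor, len(signal))
```

Because the filter only ever sees the reduced signal, any detail finer than the downscale factor is lost before filtering, which is consistent with the sub-par results reported for larger images.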

After further research into bilateral filters, and with a better understanding of them and their different implementations, I found out that the one being used was a poor-quality implementation. The one used in MATLAB was an older, slower, less accurate technique as well. This is discussed further in the “Further Research” section.

3.2 Gaussian Filter (for White)

The Gaussian filter (and filters in general) pads the image with black rather than replicating it at the borders. This created some inaccuracy at first, but I was able to overcome it by manually replicating the image 8 times, compositing the copies, applying the Gaussian blur, and then cropping the result.
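The effect of black padding versus replication can be illustrated in one dimension. The following is a minimal Python sketch, not the Quartz Composer patch: it assumes the copies are mirrored at the borders (the report does not say whether the 8 copies were mirrored or plainly tiled), and the names gaussian_kernel and blur are hypothetical.

```python
import math


def gaussian_kernel(radius, sigma):
    """Normalized 1-D Gaussian kernel with 2 * radius + 1 taps."""
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur(signal, kernel, pad):
    """Convolve `signal` with `kernel`, padding the borders as requested."""
    radius = len(kernel) // 2
    n = len(signal)
    if radius > n:
        raise ValueError("kernel radius must not exceed the signal length")
    if pad == "black":
        # Black padding: samples outside the image count as zero.
        left = [0.0] * radius
        right = [0.0] * radius
    elif pad == "mirror":
        # Mirrored copies: the 1-D analogue of surrounding the image
        # with flipped copies of itself.
        left = list(signal[:radius])[::-1]
        right = list(signal[n - radius:])[::-1]
    else:
        raise ValueError("pad must be 'black' or 'mirror'")
    padded = left + list(signal) + right
    return [sum(kernel[k] * padded[i + k] for k in range(len(kernel)))
            for i in range(n)]
```

On a constant signal, the black-padded blur darkens the samples near the border while the mirrored blur leaves them unchanged, which is the inaccuracy the replication trick is meant to remove.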

Although this improved the accuracy, it was dismal for performance, since a large filter was being applied to an area 9 times as big as the original. Also, the kernel for the Gaussian blur is huge (1/3 of the original image). To cope with the size of the kernel (without using FFTs), Quartz Composer performs an inaccurate and still slow Gaussian blur.

An example of the replication method (creating 8 copies of the image) can be found in the Quartz Composer composition, in the patch “White->Better Gaussian (Optimized)”. I even added an optimization that crops the composite image to be only as large as the original image plus the filter diameter before applying the filter, but that didn’t help much.

3.3 IPT Transform

There’s something wrong with the IPT transform matrices, at least when using them in Quartz Composer. I copied the implementation from MATLAB line for line and was still getting weird results. Then I tried doing just the XYZ ⇒ LMS transform followed by LMS ⇒ XYZ, with nothing in between, and lost a lot of information in the image. Figures 1 and 2 demonstrate the information lost in the image with only the color transformations used in the MATLAB implementation.


Figure 1: Before IPT transform in Quartz Composer

Figure 2: After IPT transform in Quartz Composer
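For comparison, the same round trip can be checked outside Quartz Composer. The following is a minimal Python sketch under stated assumptions: XYZ_TO_LMS is the D65-normalized matrix published with the IPT color space, which is assumed (not confirmed by this report) to be the one used in the MATLAB code, and the function names are hypothetical.

```python
# XYZ (D65) to LMS matrix from the IPT color space definition.
# Assumption: this is the matrix the MATLAB implementation uses.
XYZ_TO_LMS = [
    [0.4002, 0.7075, -0.0807],
    [-0.2280, 1.1500, 0.0612],
    [0.0, 0.0, 0.9184],
]


def invert_3x3(m):
    """Invert a 3x3 matrix using the adjugate formula."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if det == 0.0:
        raise ValueError("matrix is singular")
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]


def apply_matrix(m, vector):
    """Multiply a 3x3 matrix by a 3-component colour vector."""
    return [sum(row[k] * vector[k] for k in range(3)) for row in m]


def round_trip(xyz):
    """XYZ => LMS => XYZ with nothing in between."""
    lms = apply_matrix(XYZ_TO_LMS, xyz)
    return apply_matrix(invert_3x3(XYZ_TO_LMS), lms)
```

In double precision this round trip reproduces its input to within rounding error, so if the matrices in the composition match these, the loss seen in Figures 1 and 2 would have to come from somewhere other than the matrix pair itself. The report was not able to determine where.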


3.4 Min and Max for Color Channels

I needed to use Min and Max per color channel for the IPT patch and some others, that is, I had to get the maximum value of a certain channel. Fortunately, there is a patch in the new Quartz Composer to do this. Unfortunately, it maxes out at 1.0, and with HDR images we can get very high values. I overcame this by dividing the pixels of the original image by 10000 or so, then using the max patch, then multiplying the result back to get the real value. This worked in one or two cases, but failed in another because it’s a dirty hack (like a lot of Quartz Composer with HDR).
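The scaling workaround can be sketched as follows. This is a minimal Python illustration, not the Quartz Composer patch: clamped_max is a hypothetical stand-in for a max patch whose output saturates at 1.0, and the scale of 10000 is taken from the text.

```python
def clamped_max(values):
    """Hypothetical model of a max patch whose output saturates at 1.0."""
    return min(max(values), 1.0)


def hdr_max(values, scale=10000.0):
    """Scale values down, take the clamped max, then scale the result back up."""
    scaled = [v / scale for v in values]
    return clamped_max(scaled) * scale
```

Any value above the chosen scale is silently reported as the scale itself, which is one way such a hack can fail. The report does not say whether this was the failure seen in the project.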

3.5 Clipping for Display

In the MATLAB implementation, before the final image is displayed, the top 1% and the bottom 1% of the values in the histogram are clipped. There’s no similar function in Quartz Composer.

There is a histogram function which creates an array of length 256, with each element representing the percentage of the image that falls in that bucket. Using this and Quartz Composer’s JavaScript, I was able to approximate the top 99% and bottom 1% by iteratively summing the values in the array until I got to 1%, and then taking the index at that point and dividing by 256.
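The cumulative-sum approach can be sketched as follows. This is a minimal Python illustration, not the JavaScript from the composition: it assumes the histogram holds fractions that sum to 1 and that a bin index divided by the number of bins gives the clip level in the 0 to 1 range; the function names are hypothetical.

```python
def clip_levels(histogram, low=0.01, high=0.99):
    """Return (low_level, high_level) in 0..1 from a histogram of fractions."""
    bins = len(histogram)
    cumulative = 0.0
    low_level = None
    high_level = 1.0
    for index, fraction in enumerate(histogram):
        cumulative += fraction
        # First bin at which the running total reaches the low cut-off.
        if low_level is None and cumulative >= low:
            low_level = index / bins
        # First bin at which the running total reaches the high cut-off.
        if cumulative >= high:
            high_level = (index + 1) / bins
            break
    if low_level is None:
        low_level = 0.0
    return low_level, high_level


def stretch(value, low_level, high_level):
    """Clip a pixel value to the two levels and rescale it to 0..1."""
    clipped = min(max(value, low_level), high_level)
    return (clipped - low_level) / (high_level - low_level)
```

With only 256 buckets the levels are quantized to steps of 1/256, so the result is an approximation of the percentile clipping rather than an exact match.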

This hack worked at first, but subsequently broke. Not only did the JavaScript stop working, but the Image Histogram patch eventually failed as well. It just stopped working arbitrarily. See the patch “Disp⇒get crop nums” for the details.

3.6 HDR file formats

Quartz Composer treats OpenEXR and .hdr images differently for some reason: even if I open a .exr file and save it as a .hdr in Photoshop, I get completely different results. From a visual comparison, Quartz Composer displays the two images differently. I can’t test the OpenEXR images in MATLAB, but the Radiance images look much worse than the OpenEXR images in Quartz Composer.

3.7 Precision/Black Box

With Quartz Composer there’s really no way to know what’s going on with the image, how it’s treating pixels, etc. You can’t query the value of a pixel, or get debug output or anything (if you can, it’s not documented). This makes it nearly impossible to debug and compare with the MATLAB implementation, aside from the final results.

For example, if you have a really bright image that’s white on screen before you process it (values > 1), you can’t see anything in Quartz Composer to even compare with the output from MATLAB, and there’s no way to probe a pixel to see its value.

The precision of the GPU for transformations is also a potential issue. There definitely seemed to be some division by 0 and the like in certain images, which produced blank or completely white pixels.

4 Further Research

After deciding this project wasn’t possible to complete in Quartz Composer, I independently did some research into other facilities for mapping HDR images rapidly using iCAM06. The focus of this was on optimizing the bilateral filter.

4.1 Fast Implementations of the Bilateral Filter

When researching bilateral filters, my first intention was to get quality comparable to the MATLAB implementation of the bilateral filter with better performance. While looking into this topic, I found new techniques for implementing bilateral filters that hadn’t been published at the time the MATLAB code was written. These algorithms are not only faster than the technique used in MATLAB, but also more accurate, and relatively simple to implement on the GPU.

According to the paper by Paris and Durand, the technique used in the MATLAB implementation is a piecewise-linear approximation from research done in 2002. The results of their 2006 research show an improved algorithm which performs better in both execution time and accuracy of the approximation [4].

4.2 Implementation in CUDA

NVIDIA provides a framework called CUDA for processing data on the GPU. It’s much more flexible than the GLSL implementation provided in Quartz Composer because it allows looping, branching, multiple tasks, 32-bit floating point operations, and explicit control over the data. Unlike Quartz Composer, it’s not intended just for image processing, which means that custom algorithms can be used. This approach seemed very promising because it is packaged with a library for Fast Fourier Transforms, which are required to perform bilateral filtering efficiently. It even supports FFT filtering of 3-dimensional arrays, which is ideal for the Fast Bilateral Filtering algorithm described by Paris and Durand [4].

I decided to give implementing HDR mapping using CUDA a shot. The first big task was opening an OpenEXR HDR image in C, passing the memory to CUDA, reading it back out, hoping the image was the same, and writing it back out again. There were some issues doing this at first because the whole image is required to fit in the GPU memory. Eventually this was accomplished.

The second task was to apply an FFT to the image followed by the inverse transform, hoping the image would be identical to the original one when written out. This wasn’t successful. The operator for the FFT, the image, and the destination couldn’t all fit completely in memory on the GPU I have. I tried the scaled-down image too, and wasn’t successful with that either.
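The round-trip check itself is independent of CUDA and can be sketched on the CPU. The following is a minimal Python illustration, not the CUDA code from the project: it uses a textbook recursive radix-2 FFT on a single row of hypothetical HDR values, and the function names are assumptions.

```python
import cmath


def fft(values, inverse=False):
    """Recursive radix-2 FFT; the length must be a power of two."""
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor applied to the odd-indexed half.
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def ifft(values):
    """Inverse FFT, normalized by the length."""
    n = len(values)
    return [v / n for v in fft(values, inverse=True)]


def round_trip_error(row):
    """Largest absolute difference between a row and ifft(fft(row))."""
    restored = ifft(fft(row))
    return max(abs(original - back) for original, back in zip(row, restored))
```

Note that the spectrum is complex, so each transformed sample needs twice the storage of a real sample of the same precision, which adds to the memory pressure described above.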

At this point, any more attempts at other implementations were far beyond the original scope of the project.

References

[1] I. Buck. GPU computing with NVIDIA CUDA. International Conference on Computer Graphics and Interactive Techniques, 2007.

[2] P. Colantoni, N. Boukala, and J. Da Rugna. Fast and accurate color image processing using 3d graphics cards. Proceedings Vision, Modeling and Visualization 2003, 133, 2003.

[3] J. Kuang, G.M. Johnson, and M.D. Fairchild. iCAM06: A refined image appearance model for HDR image rendering. Journal of Visual Communication and Image Representation, 18(5):406–414, 2007.

[4] S. Paris and F. Durand. A Fast Approximation of the Bilateral Filter using a Signal Processing Approach. 2006.