

Scale invariance does not hold for high dynamic range images but is reestablished by early retinal nonlinearities


Antoine Grimaldi, David Kane & Marcelo Bertalmío

Introduction

In 1987, David J. Field observed that the average power spectrum of natural images follows a power law.

This feature was interpreted as a manifestation of the scale invariance of natural images, i.e. natural image statistics remain unchanged regardless of viewing distance.

It was also related to the fractal nature of natural scenes.
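The power-law statement above can be checked numerically. The following is a minimal sketch, assuming a square grayscale image and a simple ring average over integer radial frequencies; the function names `radial_power_spectrum` and `loglog_slope` are illustrative and are not taken from the poster.

```python
import numpy as np

def radial_power_spectrum(image):
    """Rotationally averaged power spectrum of a square 2-D image.

    Returns (freqs, power) with freqs in cycles/image, excluding DC.
    """
    n = image.shape[0]
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    power2d = np.abs(spectrum) ** 2
    # Integer radial frequency of every coefficient (DC sits at n // 2 after the shift).
    y, x = np.indices(power2d.shape)
    r = np.hypot(x - n // 2, y - n // 2).round().astype(int)
    # Mean power inside each ring of constant rounded radius.
    ring_sum = np.bincount(r.ravel(), weights=power2d.ravel())
    ring_count = np.bincount(r.ravel())
    radial = ring_sum / ring_count
    freqs = np.arange(1, n // 2)  # skip DC, stay below the Nyquist frequency
    return freqs, radial[freqs]

def loglog_slope(freqs, power):
    """Slope of a straight-line fit to log10(power) versus log10(frequency)."""
    slope, _intercept = np.polyfit(np.log10(freqs), np.log10(power), 1)
    return slope
```

For an image whose amplitude spectrum falls as 1/f, the power spectrum falls as 1/f^2, so the slope returned by `loglog_slope` should be close to -2.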

High Dynamic Range (HDR) images can provide more information. The dynamic range (DR) of an image is the ratio of its highest to its lowest intensity: DR = I_max / I_min.
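As an illustration of the DR definition, here is a small sketch. It assumes a linear luminance array and ignores non-positive pixels so that the ratio stays finite; the `eps` threshold and the helper names are hypothetical. The bin edges are the ones that appear in the figure legends of this poster (categories below 278 would also include images with DR under 100, which the legend does not show).

```python
import numpy as np

# Category edges read from the figure legends (100 < DR < 278, ..., 5498 < DR).
DR_EDGES = [278, 589, 1234, 5498]

def dynamic_range(luminance, eps=1e-6):
    """DR = I_max / I_min over pixels brighter than eps (eps is an assumption)."""
    lum = np.asarray(luminance, dtype=float)
    positive = lum[lum > eps]
    return positive.max() / positive.min()

def dr_category(dr):
    """Index (0 to 4) of the dynamic range category each DR value falls into."""
    return np.digitize(np.asarray(dr), DR_EDGES)
```

With these edges, the three example images named in the figure panels (DR = 678, 5325 and 65138) fall into categories 2, 3 and 4 respectively.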

[Figure: Creation of an HDR image. Images captured with different exposure times are merged into an HDR image, which is shown both as a tone mapped HDR image and as a representation of its log10 values.]

Results

[Figure: Mean power spectra for the different dynamic range categories (100 < DR < 278, 278 < DR < 589, 589 < DR < 1234, 1234 < DR < 5498, 5498 < DR). Axes: log10 power versus log10 frequency (cycles/image).]

[Figure: Examples of power spectra fitted with a first order and a second order polynomial, for images with DR = 678, DR = 5325 and DR = 65138. Each panel shows the original data, the first order fit and the second order fit against frequency (cycles/image).]

[Figure: Fitting error of the first order and second order polynomial fits as a function of dynamic range.]

[Figure: Value of the leading term a of the second order polynomial fit, P(x) = ax^2 + bx + c, as a function of DR.]

The Point Spread Function (PSF) of an optical system is its response to a point source. Light scattering in the human eye was studied by Santamaría et al. in 1987.

Photoreceptors of the human eye have a non-linear response to natural stimuli. The response curve of fish cones was studied by Naka & Rushton in 1966.

[Figure: Mean power spectra for the different dynamic range categories after PSF filtering.]

[Figure: Mean power spectra for the different dynamic range categories after PSF filtering and the Naka-Rushton (NR) equation.]

[Figure: Fitting error and value of the leading term a as a function of DR, for the original image and the retinal image (first order and second order fits).]

Discussion

The shape of the power spectra of natural images seems to be affected by the dynamic range. For HDR images of the natural world, the scale invariance property may fail.
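One way to quantify this change in shape is through the first order and second order polynomial fits named in the figure captions: a pure power law is a straight line in log-log coordinates, so the leading term a of P(x) = ax^2 + bx + c should be close to zero when scale invariance holds. The sketch below assumes the fit is done on log10 values and that the fitting error is the root-mean-square residual; the poster does not state how the error was defined, so that choice and the name `fit_loglog` are assumptions.

```python
import numpy as np

def fit_loglog(freqs, power, order):
    """Fit a polynomial of the given order to log10(power) versus log10(freqs).

    Returns (coeffs, error): coeffs are ordered from the highest degree down,
    and error is the RMS residual in log10 units (an assumed error measure).
    """
    x = np.log10(freqs)
    y = np.log10(power)
    coeffs = np.polyfit(x, y, order)
    residual = y - np.polyval(coeffs, x)
    error = np.sqrt(np.mean(residual ** 2))
    return coeffs, error
```

For a second order fit, `coeffs[0]` is the leading term a. A markedly non-zero a, together with a first order error that is clearly larger than the second order error, would indicate a curved spectrum rather than a power law.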

As pointed out by Dror et al., this may be due solely to the presence of a light source in the image. The specularity of these images may concentrate the energy in the low frequencies of the power spectrum.

At the retinal image level, the 1/f rule is recovered. The non-linear response of the photoreceptors seems to play a role in the recovery of the scale invariance property.
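The retinal image referred to above results from two steps: filtering with the PSF of the eye, then a photoreceptor nonlinearity. The sketch below is only an illustration of that pipeline under loud assumptions: a normalized Gaussian stands in for the measured PSF of Santamaría et al., the nonlinearity uses the common Naka-Rushton form R = I^n / (I^n + s^n), and the semi-saturation constant s defaults to the mean of the blurred image. None of these parameter choices come from the poster.

```python
import numpy as np

def gaussian_psf(size, sigma):
    """Normalized Gaussian kernel, a stand-in for the eye's PSF (assumption)."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    psf = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return psf / psf.sum()

def apply_psf(image, psf):
    """Circular convolution of a square image with a centered PSF of the same size."""
    otf = np.fft.fft2(np.fft.ifftshift(psf))
    return np.fft.ifft2(np.fft.fft2(image) * otf).real

def naka_rushton(intensity, semi_saturation, n=1.0):
    """Naka-Rushton response, normalized so that it saturates at 1."""
    i_n = np.power(intensity, n)
    return i_n / (i_n + semi_saturation ** n)

def retinal_image(image, sigma=1.5, semi_saturation=None):
    """PSF filtering followed by the Naka-Rushton nonlinearity."""
    psf = gaussian_psf(image.shape[0], sigma)
    blurred = np.clip(apply_psf(image, psf), 0.0, None)  # remove FFT round-off negatives
    if semi_saturation is None:
        semi_saturation = blurred.mean()  # hypothetical default
    return naka_rushton(blurred, semi_saturation)
```

Because the Naka-Rushton curve is compressive, the output lies between 0 and 1 and has a smaller dynamic range than the input, which is consistent with the idea that the nonlinearity reduces the influence of very bright regions on the power spectrum.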

The human visual system (HVS) has evolved to cope with the natural world. Understanding the statistical regularities of the world can help us understand the HVS and aid the design of image processing algorithms.

Other studies reported the same power-law behavior: Burton & Moorhead 1987, Tadmor & Tolhurst 1993, Ruderman 1994, Van Der Schaaf & Van Hateren 1995, Balboa & Grzywacz 2002.

The optics of the human eye and the photoreceptors' response alter the real-world illumination values to create the retinal image.

The image dataset comes from equirectangular projections of the Southampton-York Natural Scenes 3D images, which present some artifacts. However, these artifacts do not seem to affect the statistics used here.

[Figure: Creation of 4 square images from 1 equirectangular representation.]

[Figure: Average power spectrum as a function of frequency (log10 amplitude versus log10 spatial frequency in cycles/picture). Extracted from: Relations between the statistics of natural images and the response properties of cortical cells, D. Field, 1987.]

[Figure: Representations of the PSF of the human eye. Extracted from: Determination of the point-spread function of human eyes using a hybrid optical-digital method, J. Santamaría et al., 1987.]

[Figure: S-potentials of fish cones. Extracted from: S-potentials from colour units in the retina of fish (Cyprinidae), Naka & Rushton, 1966.]

Only two studies have focused on the statistics of HDR images: Dror et al., 2001 and Pouli et al., 2010.

Future Work

• Creation of an HDR natural image dataset with better control over the possible effects of the DR

• Creation of a synthetic HDR dataset using a physically accurate rendering model

References

Field, D. J. (1987). Relations between the statistics of natural images and the response properties of cortical cells. JOSA A, 4(12).

Santamaría, J., Artal, P., & Bescós, J. (1987). Determination of the point-spread function of human eyes using a hybrid optical–digital method. JOSA A, 4(6), 1109-1114.

Naka, K. I., & Rushton, W. A. H. (1966). S-potentials from colour units in the retina of fish (Cyprinidae). The Journal of Physiology, 185(3), 536-555.

Dror, R. O., Leung, T. K., Adelson, E. H., & Willsky, A. S. (2001). Statistics of real-world illumination. In Computer Vision and Pattern Recognition, 2001. CVPR 2001. Proceedings of the 2001 IEEE Computer Society Conference On (Vol. 2, pp. II-II). IEEE.
