Image reconstruction based on back propagation
learning in Compressed Sensing theory
Gaoang Wang
Project for ECE 539
Fall 2013
Abstract
Over the past few years, a new framework known as compressive sampling has been developed
for simultaneous sampling and compression. It can significantly reduce the number of
measurements required for a given signal compared with traditional compression methods. One way
to increase the compression ratio during sampling is to apply different compression ratios to
different parts of the image: raising the compression ratio of the background raises the total
compression ratio. However, we know nothing about the original image before we start sampling.
We can therefore use the data from a first sampling pass to judge which parts of the image
belong to the background, and then apply a second sampling pass to those parts.
Before sampling an image, a large set of images is used as training data to compute the
weights of the classifier. In the second sampling, these weights decide which
parts belong to the background. In the first sampling, I apply block-based
Compressed Sensing and use the sampled data as feature vectors. In the reconstruction step, I use
the OMP method and a DCT matrix to reconstruct the image.
Content
Introduction
Compressed Sensing
2.1 Background
2.2 Algorithm of SPL-BCS
MLP and Back-Propagation
3.1 Introduction of Back-Propagation
3.2 Training method in image reconstruction
3.2.1 Feature vectors in training
3.2.2 BP learning program
3.2.3 Sampling testing data
3.2.4 Image reconstruction algorithm
Results
4.1 The confusion rate of BP learning
Introduction
Over the past few years, a new framework known as compressive sampling has been
developed for simultaneous sampling and compression. It can significantly reduce the
number of measurements required for a given signal compared with traditional compression
methods. Compressed sensing (CS), built upon the groundbreaking work by Candès et
al. [1] and Donoho [2], aims at exactly reconstructing the original signal while
sampling at a sub-Nyquist rate. Unlike traditional theories, CS theory greatly reduces
the signal sampling rate, signal processing time, and data storage and transmission costs.
Due to its great practical potential, CS has been intensively studied and used in both
academia and industry in the past few years [3, 4]. The field of CS is related to other
topics in signal processing and computational mathematics, such as underdetermined linear
systems, group testing, heavy hitters, sparse coding, multiplexing, sparse sampling, and finite
rate of innovation. Imaging techniques having a strong affinity with CS include coded
aperture and computational photography.
There are many CS-based image reconstruction algorithms, such as
block-based CS sampling (BCS) [11]. It is an efficient method that can address
the artifact problem along block edges. Because we know little about
the original image before sampling, few algorithms take image characteristics into account
during reconstruction. Most parts of natural images are smooth,
so a high compression ratio could be used in sampling. However,
some parts of an image have complicated texture, and these parts can hardly be
reconstructed well at a high compression ratio. We therefore have to reduce the total
compression ratio even if most of the image is very smooth. Fortunately, we can
apply a learning method to training data. In this way, after a first sampling pass we know
which parts of the image are smoother than others. We can then sample these
parts a second time at a higher compression ratio, which increases the total compression
ratio.
Compressed Sensing
2.1 Background
Consider a real-valued, N-length, one-dimensional, discrete-time signal x, which can
be viewed as an N × 1 column vector in R^N with elements x[n], n = 1, 2, ..., N.
Suppose that we are allowed to take M (M << N) linear, non-adaptive measurements of x
through the following linear transformation [1, 2]:
y = Φx, (1.1)
where y represents an M × 1 sampled vector and Φ is an M × N measurement matrix.
Since M << N, the reconstruction of x from y is generally ill-posed. However,
CS theory is based on the fact that x has a sparse representation in a known transform
domain Ψ. In other words, the transform-domain signal f = Ψx can be well
approximated using only d < M << N non-zero entries. It was proved in [1, 2] that when
Φ and Ψ are incoherent, x can be well recovered from M measurements. In the study
of CS, the most important issues include: (a) the design of the measurement
matrix Φ; (b) the selection of the transform Ψ; (c) the reconstruction algorithm.
A random matrix is usually selected as the measurement matrix, since incoherence is
then achieved with high probability. As for the transform, options include DCT, wavelet,
grouplet, bandlet and curvelet [6], the dual-tree discrete wavelet transform (DDWT) [7],
contourlet [8] and so on. Among reconstruction methods, orthogonal matching pursuit
(OMP) and basis pursuit (BP) are the classical ones. For 2D images, another well-known
reconstruction approach is the minimization of total variation (TV) [9]. Other
algorithms include iterative soft-thresholding and projection onto convex sets [10].
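As a rough illustration of the measurement model (1.1) and of greedy recovery, the sketch below implements a minimal OMP in NumPy. It is not the code used in this project: the sizes N, M and d, the random seed, the stopping tolerance, and the choice of a signal that is sparse directly in the sample domain (Ψ taken as the identity) are all assumptions made to keep the example short.

```python
import numpy as np

def omp(A, y, max_atoms, tol=1e-10):
    """Orthogonal matching pursuit: greedily build a sparse s with A @ s close to y."""
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    while len(support) < max_atoms and np.linalg.norm(residual) > tol:
        # Pick the column most correlated with the current residual.
        correlations = np.abs(A.T @ residual)
        if support:
            correlations[support] = 0.0   # never select the same column twice
        support.append(int(np.argmax(correlations)))
        # Least-squares fit on the selected columns, then refresh the residual.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    s = np.zeros(A.shape[1])
    s[support] = coef
    return s

rng = np.random.default_rng(0)
N, M, d = 256, 128, 3                           # illustrative sizes (assumed)
Phi = rng.standard_normal((M, N)) / np.sqrt(M)  # random Gaussian measurement matrix
x = np.zeros(N)
idx = rng.choice(N, size=d, replace=False)
x[idx] = rng.choice([-1.0, 1.0], size=d) * (1.0 + rng.random(d))
y = Phi @ x                                     # M linear measurements, as in (1.1)
x_hat = omp(Phi, y, max_atoms=M // 4)
max_error = float(np.max(np.abs(x_hat - x)))
```

For a signal that is sparse in a transform domain instead, the same routine would be applied to the product of Φ and the inverse transform, under the same assumptions.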
2.2 Algorithm of SPL-BCS
In BCS, an image is divided into B × B blocks and sampled using an
appropriately sized measurement matrix. That is, suppose that x_j is a vector
representing, in raster-scan fashion, block j of input image x. The corresponding y_j is
then y_j = Φ_B x_j, where Φ_B is an M_B × B² orthonormal measurement matrix with
M_B = (M/N)B².
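The block sampling step can be sketched as follows. This is a minimal illustration, assuming a random stand-in image and assuming that the orthonormal rows of Φ_B are obtained from the QR factorisation of a Gaussian matrix; the helper names `image_to_blocks` and `orthonormal_measurement_matrix` are hypothetical and do not come from the report.

```python
import numpy as np

def image_to_blocks(image, B):
    """Return a (B*B, num_blocks) array; column j is block j in raster-scan order."""
    H, W = image.shape
    blocks = image.reshape(H // B, B, W // B, B).swapaxes(1, 2)
    return blocks.reshape(-1, B * B).T

def orthonormal_measurement_matrix(M_B, n, rng):
    """M_B x n matrix with orthonormal rows, built from a Gaussian matrix via QR."""
    Q, _ = np.linalg.qr(rng.standard_normal((n, M_B)))
    return Q.T

rng = np.random.default_rng(0)
B = 8
subrate = 0.25                        # M/N, assumed for illustration
M_B = int(round(subrate * B * B))     # M_B = (M/N) * B^2 = 16
image = rng.random((256, 256))        # stand-in for a grayscale image
X = image_to_blocks(image, B)         # 64 x 1024, one column per block
Phi_B = orthonormal_measurement_matrix(M_B, B * B, rng)
Y = Phi_B @ X                         # y_j = Phi_B x_j for every block j
```

Because every block shares the same small Φ_B, only a 16 × 64 matrix has to be stored, which is the first merit listed in the following paragraph.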
Using BCS rather than random sampling applied to the entire image x has
several merits [11]. First, the measurement operator Φ_B is conveniently stored and
employed because of its compact size. Second, the encoder does not need to wait until
the entire image is measured, but may send each block after its linear projection. Last,
an initial approximation x^(0) with minimum mean squared error can be feasibly
calculated due to the small size of Φ_B [11].
In [11], Wiener filtering was incorporated into the basic projected Landweber (PL)
framework in order to remove blocking artifacts. In essence, this operation imposes
smoothness in addition to the sparsity inherent to PL. Specifically, in [11], a
Wiener-filtering step was interleaved with the PL projection of (2)–(3); thus, the
approximation to the image at iteration i + 1, x^{(i+1)}, is produced from x^{(i)} as:

    function x^{(i+1)} = SPL(x^{(i)}, y, \Phi_B, p_1, p_2)
        \hat{x}^{(i)} = Wiener(x^{(i)})
        for each block j
            \hat{\hat{x}}_j^{(i)} = \hat{x}_j^{(i)} + p_1 \Phi_B^T (y_j - \Phi_B \hat{x}_j^{(i)})
        \check{\check{x}}^{(i)} = D \hat{\hat{x}}^{(i)}
        \check{x}^{(i)} = Threshold(\check{\check{x}}^{(i)}, \lambda)
        \bar{x}^{(i)} = D^{-1} \check{x}^{(i)}
        for each block j
            x_j^{(i+1)} = \bar{x}_j^{(i)} + p_2 \Phi_B^T (y_j - \Phi_B \bar{x}_j^{(i)})

Here, Wiener(∙) is pixelwise adaptive Wiener filtering using a neighborhood of 3 × 3,
while Threshold(∙) is a thresholding process applied in the transform domain. In our use
of SPL, we initialize with x^{(0)} = \Phi^T y and terminate when
|D^{(i+1)} − D^{(i)}| < 10^{−4}, where

    D^{(i)} = \frac{1}{\sqrt{N}} \| x^{(i)} - \hat{\hat{x}}^{(i-1)} \|_2

The scalar D^{(i)} of the stopping rule is distinct from the transform matrix D that
appears in the pseudocode.
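The projection that appears twice in each SPL iteration can be illustrated in isolation. The sketch below assumes a unit step size and a Φ_B with orthonormal rows (built here via QR, which is an assumption), in which case the projected blocks reproduce the measurements exactly. The Wiener filtering and thresholding stages are replaced by a small random perturbation, purely as a stand-in.

```python
import numpy as np

def project_onto_measurements(X, Y, Phi_B):
    """Block-wise projection X + Phi_B^T (Y - Phi_B X); columns of X are vectorised blocks."""
    return X + Phi_B.T @ (Y - Phi_B @ X)

rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((64, 16)))
Phi_B = Q.T                                   # 16 x 64 with orthonormal rows (assumed)
X_true = rng.random((64, 10))                 # ten stand-in 8 x 8 blocks
Y = Phi_B @ X_true                            # their measurements
X0 = Phi_B.T @ Y                              # initialisation x^(0) = Phi^T y
# Stand-in for the filtered / thresholded estimate, which no longer matches Y exactly.
X_smoothed = X0 + 0.05 * rng.standard_normal(X0.shape)
X_proj = project_onto_measurements(X_smoothed, Y, Phi_B)
consistency_error = float(np.max(np.abs(Phi_B @ X_proj - Y)))
```

With orthonormal rows, Φ_B applied to the projected estimate returns Y up to rounding, so each iteration ends with an estimate that is consistent with the sampled data.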
MLP and Back-Propagation
3.1 Introduction of Back-Propagation
Multilayer perceptrons have been applied successfully to solve some difficult and
diverse problems by training them in a supervised manner with a highly popular
algorithm known as the error back-propagation algorithm. This algorithm is based on
the error-correction learning rule. As such, it may be viewed as a generalization of an
equally popular adaptive filtering algorithm: the ubiquitous least-mean-square (LMS)
algorithm for the special case of a single linear neuron model.
Basically, the error back-propagation process consists of two passes through the
different layers of the network: a forward pass and a backward pass. In the forward
pass, an activity pattern (input vector) is applied to the sensory nodes of the network,
and its effect propagates through the network, layer by layer. Finally, a set of outputs
is produced as the actual response of the network. During the forward pass the
synaptic weights of the network are all fixed. During the backward pass, on the other
hand, the synaptic weights are all adjusted in accordance with the error-correction rule.
Specifically, the actual response of the network is subtracted from a desired (target)
response to produce an error signal. Then this error signal is propagated backwards
through the network, against the direction of synaptic connections, hence the name
“error back-propagation”. The synaptic weights are adjusted so as to make the actual
response of the network move closer to the desired response.
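The two passes can be sketched for a small network. This is a minimal NumPy illustration, not the program used in this project: it assumes one hidden layer, sigmoid units, no bias terms, a sum-of-squares error, and arbitrary toy data, and the learning rate below is chosen only for the toy problem. A finite-difference check is included to confirm that the backward pass yields the gradient of the error.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(W1, W2, X):
    """Forward pass: weights stay fixed while signals propagate layer by layer."""
    H = sigmoid(X @ W1)                  # hidden-layer activations
    Y = sigmoid(H @ W2)                  # actual response of the network
    return H, Y

def loss(W1, W2, X, T):
    _, Y = forward(W1, W2, X)
    return 0.5 * np.sum((T - Y) ** 2)

def backward(W1, W2, X, T):
    """Backward pass: propagate the error signal back and return dE/dW1, dE/dW2."""
    H, Y = forward(W1, W2, X)
    error = T - Y                        # desired minus actual response
    delta2 = -error * Y * (1.0 - Y)      # error signal at the output layer
    delta1 = (delta2 @ W2.T) * H * (1.0 - H)   # error signal sent back to the hidden layer
    return X.T @ delta1, H.T @ delta2

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 3))                      # toy inputs (assumed)
T = rng.integers(0, 2, size=(5, 1)).astype(float)    # toy binary targets (assumed)
W1 = 0.5 * rng.standard_normal((3, 4))
W2 = 0.5 * rng.standard_normal((4, 1))

# Compare one analytic gradient entry with a central finite difference.
g1, g2 = backward(W1, W2, X, T)
eps = 1e-6
W1_plus, W1_minus = W1.copy(), W1.copy()
W1_plus[0, 0] += eps
W1_minus[0, 0] -= eps
numeric = (loss(W1_plus, W2, X, T) - loss(W1_minus, W2, X, T)) / (2 * eps)
analytic = g1[0, 0]

# Plain gradient descent: adjust the weights against the error gradient.
eta = 0.1
loss_before = loss(W1, W2, X, T)
for _ in range(200):
    g1, g2 = backward(W1, W2, X, T)
    W1 = W1 - eta * g1
    W2 = W2 - eta * g2
loss_after = loss(W1, W2, X, T)
```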
3.2 Training method in image reconstruction
I use a large number of sampled blocks from images as training data. The outputs
determine whether these blocks can bear a higher compression ratio. After training, I use
the final weights during sampling. Given an original image, I take two rounds of
measurements. In the first round, since we know nothing about the
characteristics of the image, we use a general, lower compression ratio for all
image blocks. Then the weights from training come into use: they decide whether
a given part of the image can tolerate a higher compression ratio. If so, that part
undergoes a second sampling. With this method, the total compression ratio
increases.
3.2.1 Feature vectors in training
In real-time processing, we know nothing about the original image before sampling;
we can only act after the first sampling step. This means that only the
sampled data can serve as training data for the BP learning method. The
sampled data is Φ_B∙v, where Φ_B is a Gaussian random matrix and v is a vectorized
block of the original image. Since Φ_B is a random matrix, BP learning can hardly
work if Φ_B∙v is used directly as the feature vector. Therefore, I apply two
pre-processing measures:
(1) Fix the measurement matrix. Throughout the whole process, I use the same random
matrices in every sampling step. When sampling a new image, I do not
generate a new random matrix.
(2) To reduce the randomness introduced by the measurement matrix, I multiply the
samples by the pseudo-inverse of the measurement matrix, i.e., I take
Φ_B^(-1)∙Φ_B∙v as the feature vector, where Φ_B^(-1) denotes the pseudo-inverse.
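The two pre-processing measures can be sketched as below. The seed value and the random stand-in block are assumptions; the matrix size 16 × 64 follows the setup described in this section, and `feature_vector` is a hypothetical helper name.

```python
import numpy as np

rng = np.random.default_rng(42)          # fixed seed: the same matrix is reused for every image
Phi_B = rng.standard_normal((16, 64))    # Gaussian measurement matrix, generated once
Phi_pinv = np.linalg.pinv(Phi_B)         # 64 x 16 pseudo-inverse, also computed once

def feature_vector(y):
    """Map a 16 x 1 sample vector back to a 64 x 1 feature vector."""
    return Phi_pinv @ y

v = rng.random(64)                       # stand-in for a vectorised 8 x 8 block
y = Phi_B @ v                            # first-pass samples of the block
f = feature_vector(y)                    # pinv(Phi_B) @ Phi_B @ v
```

The feature vector is the minimum-norm block that agrees with the samples, so it lives in the pixel domain of the block rather than in the random measurement domain.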
In this project, the training data comes from 10 images (256 × 256). To save
computing time, all 10 images are gray-level. The block size is 8 × 8, so each
image yields 256²/8² = 1024 training vectors, for 10240 training vectors in total.
The size of Φ_B is 16 × 64, i.e., a compression ratio of 4. The training images are
given below:
Each image is 256 × 256.
3.2.2 BP learning program
As mentioned above, each feature vector is 64 × 1. To reduce the computational
complexity, each run randomly selects 1024 of the 10240 vectors as training
data. I run the training program hundreds of times, using 3-layer and 4-layer
configurations separately. In addition, I set Epoch = 1000, μ = 0.8 and η = 0.01.
For the first run, I use random values as the initial weights. For subsequent runs,
I use the weights generated by the previous run as the initial weights. The
confusion rate is plotted below:
[Figure: confusion rate plots for the 3 layer and 4 layer configurations]
From the plots we can see that the variance of the rate for the 4-layer configuration
is much higher than for the 3-layer configuration.
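The report does not state what μ and η stand for. Assuming, as is common in back-propagation programs, that η = 0.01 is the learning rate and μ = 0.8 is the momentum constant, one weight update can be sketched as follows; the quadratic error used to exercise it is a toy example, not the MLP of this project.

```python
def momentum_step(w, velocity, grad, eta=0.01, mu=0.8):
    """One gradient step with momentum: the previous update is reused, scaled by mu."""
    velocity = mu * velocity - eta * grad
    return w + velocity, velocity

# Toy problem: minimise E(w) = 0.5 * w**2, whose gradient is simply w.
w, velocity = 5.0, 0.0
for epoch in range(1000):                # Epoch = 1000, as in the settings above
    grad = w
    w, velocity = momentum_step(w, velocity, grad)
```

Warm-starting each run from the weights of the previous run, as described above, amounts to keeping `w` between calls instead of re-drawing it at random.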
3.2.3 Sampling testing data
After the BP learning algorithm, we have the weights of the MLP. In this step we use
these weights to process the testing data. I use 12 images (512 × 512) as testing
examples. All 12 images are shown below:
Each image is 512 × 512. From upper left to lower right, we denote the images as 1 to 12.
In the first step, we divide each image into blocks of size 8 × 8. Then we sample each
block with a compression ratio of 4. In this step, we do not need to consider the
characteristics of the images. Having obtained the sampled blocks, we pass them through
the trained MLP (i.e., apply the weights), which tells us which parts of the image
should undergo a second step of compressed sensing.
In the second step, all blocks that qualify are resampled with a measurement matrix
of size 4 × 64, i.e., at a compression ratio of 16. Thus the total compression ratio
increases.
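The effect of the second sampling on the total compression ratio can be worked out with a short sketch. It assumes that only the final measurements of each block are counted (4 for a resampled block, 16 otherwise), which the report does not state explicitly but which is consistent with the reported total ratios lying between 4 and 16. The function name and the flag lists are hypothetical.

```python
def overall_compression_ratio(smooth_flags, block_pixels=64, m_first=16, m_second=4):
    """Total pixels divided by total retained measurements over all blocks."""
    total_pixels = block_pixels * len(smooth_flags)
    total_measurements = sum(m_second if smooth else m_first for smooth in smooth_flags)
    return total_pixels / total_measurements

no_block_resampled = overall_compression_ratio([False] * 8)            # 4.0
every_block_resampled = overall_compression_ratio([True] * 8)          # 16.0
half_resampled = overall_compression_ratio([True, False] * 4)          # 6.4
```

Under this counting, the total ratio approaches 16 only when the classifier marks nearly every block as smooth.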
3.2.4 Image reconstruction algorithm
In the reconstruction, we apply the SPL-BCS algorithm by James E. Fowler. In
block-based CS, artifacts easily appear along block edges in the final
reconstructed image, so this algorithm uses a Wiener filter to remove blocking
artifacts. The pseudocode is given below:

    function x^{(i+1)} = SPL(x^{(i)}, y, \Phi_B, p_1, p_2)
        \hat{x}^{(i)} = Wiener(x^{(i)})
        for each block j
            \hat{\hat{x}}_j^{(i)} = \hat{x}_j^{(i)} + p_1 \Phi_B^T (y_j - \Phi_B \hat{x}_j^{(i)})
        \check{\check{x}}^{(i)} = D \hat{\hat{x}}^{(i)}
        \check{x}^{(i)} = Threshold(\check{\check{x}}^{(i)}, \lambda)
        \bar{x}^{(i)} = D^{-1} \check{x}^{(i)}
        for each block j
            x_j^{(i+1)} = \bar{x}_j^{(i)} + p_2 \Phi_B^T (y_j - \Phi_B \bar{x}_j^{(i)})

Since there are two kinds of Φ_B in the algorithm (one of size 16 × 64 and
another of size 4 × 64), the algorithm has to be split into two parts: each
part of the image is processed with its own measurement matrix, and the results
are then combined.
Results
4.1 The confusion rate of BP learning
When the obtained weights are applied to the testing examples, the confusion rates are
as given below:
Image   3 layer Con rate   4 layer Con rate
1       0.8862             0.4082
2       0.8884             0.2363
3       0.6877             0.4980
4       0.8022             0.4341
5       0.8464             0.3984
6       0.8865             0.2979
7       0.8545             0.3831
8       0.7207             0.6763
9       0.7849             0.4666
10      0.7747             0.3599
11      0.7332             0.5967
12      0.7380             0.5818
We see that the confusion rates in the table are not very high, especially for the
4-layer configuration. However, these rates serve only as a reference, since the label
values I assigned to the testing images are subjective: if I judged a part of the image
to be smooth enough, I gave it the label 1; otherwise, I gave it 0. This is where the
target labels come from.
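The report does not define how the confusion rate is computed. One reading that fits the discussion of the results is that it is the fraction of blocks whose predicted label agrees with the subjective target label; the sketch below computes that quantity together with a 2 × 2 confusion matrix. Both the interpretation and the toy labels are assumptions.

```python
def confusion_matrix(predicted, target):
    """2 x 2 counts: rows are target labels (0, 1), columns are predicted labels (0, 1)."""
    counts = [[0, 0], [0, 0]]
    for p, t in zip(predicted, target):
        counts[t][p] += 1
    return counts

def agreement_rate(predicted, target):
    """Fraction of blocks whose predicted label matches the target label."""
    counts = confusion_matrix(predicted, target)
    return (counts[0][0] + counts[1][1]) / len(target)

target = [1, 0, 0, 1]        # 1 = block judged smooth by hand, 0 = not smooth
predicted = [1, 0, 1, 1]     # labels produced by a hypothetical classifier
counts = confusion_matrix(predicted, target)
rate = agreement_rate(predicted, target)
```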
The most interesting results are the total compression rates and the PSNR of
the reconstructed images. The results are shown below:
Image   3 layer BP               4 layer BP
        PSNR (dB)   Com.Rate     PSNR (dB)   Com.Rate
1       27.3083     14.3184      28.9963     6.2969
2       29.7727     14.4795      31.5096     4.3457
3       27.4235     13.7207      29.9105     4.9053
4       26.5394     13.9990      28.2385     5.6025
5       28.3924     14.7900      30.0847     6.5488
6       29.0541     13.9902      30.1757     4.6943
7       28.4538     14.2832      30.0008     5.5557
8       21.1964     10.7676      21.4562     4.8203
9       23.5239     12.7744      24.2369     4.2490
10      27.1238     14.2217      28.4484     4.2813
11      24.3417     13.1992      25.2604     6.4990
12      22.2190     12.4609      22.7841     5.4414

Image   SPL-BCS
        PSNR (dB)   Com.Rate
1       26.2417     8
2       29.4186     8
3       27.1824     8
4       25.1452     8
5       27.8494     8
6       27.5375     8
7       26.8514     8
8       19.8811     8
9       23.2498     8
10      26.5758     8
11      22.7026     8
12      20.1181     8
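For reference, PSNR values such as those in the tables are conventionally computed from the mean squared error between the original and reconstructed images. The sketch below assumes 8-bit gray-level images with a peak value of 255; the report does not give its own PSNR code, so this is only the standard definition.

```python
import numpy as np

def psnr(reference, reconstruction, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal size."""
    diff = reference.astype(np.float64) - reconstruction.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")              # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

reference = np.zeros((8, 8), dtype=np.uint8)
off_by_one = np.ones((8, 8), dtype=np.uint8)     # every pixel wrong by one gray level
value = psnr(reference, off_by_one)              # MSE = 1, so PSNR = 20 * log10(255)
```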
We can see from the tables that the algorithm using 3-layer BP learning is efficient.
On the one hand, it increases the total compression ratio. On the other hand, the
reconstructed images achieve a higher PSNR than SPL-BCS. With the 4-layer
configuration, however, the confusion rates are very low (as shown in the first table
above), so most smooth parts of the images have been classified as non-smooth. Few
blocks therefore undergo a second sampling, which leads to a low compression ratio.
Because the compression ratio is low, the PSNR is higher than with the 3-layer
configuration (by roughly 0.3 to 2.5 dB in the table).
Furthermore, we can see the difference between using the BP method and not using it in
the reconstructed images.
The left-hand side shows reconstructed images using 3 layer BP learning and the
right-hand side shows images without BP learning.
From the reconstructed images, we find that the edges with BP learning are much
clearer than the edges without BP learning.
Because of the time limit, I have not compared many different MLP configurations. In
future research, I would investigate which MLP configuration is better for image
reconstruction.
References
[1] E. Candès, J. Romberg, and T. Tao, “Robust uncertainty principles: Exact signal
reconstruction from highly incomplete frequency information,” IEEE Trans. Inform. Theory,
vol. 52, no. 2, pp. 489–509, Feb. 2006.
[2] D. Donoho, “Compressed sensing,” IEEE Trans. Inform. Theory, vol. 52, no. 4,
pp. 1289–1306, Apr. 2006.
[3] Y. Tsaig and D. L. Donoho, “Extensions of compressed sensing,” Signal Processing, vol. 86,
pp. 533–548, July 2006.
[4] D. L. Donoho, Y. Tsaig, I. Drori, and J.-L. Starck, “Sparse solution of underdetermined linear
equations by stagewise orthogonal matching pursuit,” Mar. 2006.
[5] Y. C. Pati, R. Rezaiifar, and P. S. Krishnaprasad, “Orthogonal matching pursuit: Recursive
function approximation with applications to wavelet decomposition,” in Conf. Rec. 27th
Asilomar Conf. Signals, Syst. Comput., vol. 1, pp. 40–44, 1993.
[6] E. Le Pennec and S. Mallat, “Bandelet image approximation and compression,” Multiscale
Modeling & Simulation, vol. 4, no. 4, pp. 992–1039, 2005.
[7] N. G. Kingsbury, “Complex wavelets for shift invariant analysis and filtering of signals,”
Applied and Computational Harmonic Analysis, vol. 10, pp. 234–253, May 2001.
[8] M. N. Do and M. Vetterli, “The contourlet transform: An efficient directional multiresolution
image representation,” IEEE Transactions on Image Processing, vol. 14, no. 12,
pp. 2091–2106, December 2005.
[9] E. Candès, J. Romberg, and T. Tao, “Stable signal recovery from incomplete and inaccurate
measurements,” Communications on Pure and Applied Mathematics, vol. 59, no. 8,
pp. 1207–1223, August 2006.
[10] E. Candès and J. Romberg, “Practical signal recovery from random projections,” 2005.
[Online]. Available: http://www.dsp.ece.rice.edu/CS
[11] L. Gan, “Block compressed sensing of natural images,” in Proceedings of the International
Conference on Digital Signal Processing, Cardiff, UK, pp. 403–406, July 2007.