

824 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 4, NO. 6, JUNE 1995

Correspondence

Multiple-Cost Constraints for the Design of Tree-Structured Vector Quantizers

Jianhua Lin

Abstract-Minimizing the distortion subject to a cost constraint is fundamental in the design of tree-structured vector quantizers. Because of various competing cost measures, the use of single-cost constraints has led to undesirable results. We study the relationships among several cost functions and show how multiple-cost constraints can be used to significantly improve tree design.

I. INTRODUCTION

Vector quantization (VQ) achieves lossy data compression by mapping each block of input parameters to one of the code vectors in a predetermined codebook. The key in VQ is the design of a codebook, which involves the partitioning of a multidimensional space into a finite number of regions. Each region is assigned a code vector, and every vector within a region is quantized to the code vector of the region. For a given input space, it is desirable to choose a codebook that minimizes the expected distortion subject to some cost constraint. Finding such an optimal codebook is generally difficult or provably intractable in some cases, as shown by Lin [4].

Tree-structured vector quantization (TSVQ) is a technique to reduce the complexity of VQ by imposing a structural constraint on the quantizer [2], [9]. The partitioning is required to have a tree structure that can be used for fast quantization search through a root-to-leaf path. Tree-structured vector quantizers can also be designed to minimize the expected distortion subject to a cost constraint. Various cost functions can be defined for a TSVQ to measure different costs such as storage, lossless encoding rate, or quantization time. Unfortunately, the problem of optimal design has been shown to be intractable in most cases [4], [6], and heuristic techniques have to be used.

One of the most commonly used heuristics is based on the successive partitioning of an input space [2], [9]. The key to the technique is the determination of the specific subregion at each subdivision. Various selection criteria have been proposed in the past [9]-[11]. A tree pruning technique has also been developed in which a large initial tree is constructed first and then pruned back according to some optimization criterion [3].

Most of the design algorithms considered in the past are based on constraints of single cost functions. Chou et al. [3] defined a family of meaningful cost functions that can be used as constraints. It is necessary to determine which cost measure is to be used in a particular design. The use of a specific cost function as the constraint may lead to trees that have low cost in terms of the constrained measure but higher cost in terms of other measures. This is particularly problematic when the tree is designed based on training data and used to quantize data outside of the training set. Overspecialization

Manuscript received August 26, 1992; revised June 23, 1994. The associate editor coordinating the review of this paper and approving it for publication was Prof. Dr.-Ing. Bernd Girod.

The author is with the Department of Computer Science, Eastern Connecticut State University, Willimantic, CT 06226 USA.

IEEE Log Number 9411134.

to the training samples may reduce the performance of the quantizers obtained.

Although the different cost measures reflect different costs of a TSVQ, some of them are also related. In this correspondence, we analyze the relationships among several common cost functions and consider how multiple costs can be incorporated in the design of TSVQ. It is possible to formalize the problem of optimal tree design to minimize the expected distortion subject to multiple cost constraints. The optimization problems are, however, difficult to solve except for some special cases.

The solution we propose is to combine multiple cost measures into one cost function. The combined cost measure satisfies the same properties of the single cost functions involved, and all the previously developed design algorithms can be used. It is also possible to associate a weight with each cost involved to reflect its relative importance. Experimental results in image compression show that the new design improves significantly the performance of existing algorithms.

II. TREE-STRUCTURED VECTOR QUANTIZATION

In TSVQ, an input space is partitioned into a hierarchy of regions. The advantage of such a hierarchical partitioning is that the structure can be represented in a tree. Each leaf in the tree represents a region in the partitioning, and each internal node represents the union of its children's regions. In general, there could be any number of partitions at each level of the hierarchy. In practice, binary partitions are used most widely, and the corresponding structure is a binary tree. Our discussion will thus be focused on binary TSVQ, although the ideas apply to other tree structures as well. We will use the usual terminology for binary trees (root, left and right children, parent, ancestor, descendant, leaf, path, etc.), as defined in any relevant textbook.
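The fast root-to-leaf search described above can be sketched as follows; the `Node` class and the example tree are illustrative, not from the paper. Note that the greedy descent need not reach the globally nearest leaf code vector — that is the price of the O(depth) search.

```python
# Illustrative sketch (not from the paper) of quantizing an input vector with
# a binary TSVQ by a root-to-leaf search.
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Node:
    code: Tuple[float, ...]            # code vector y_t for this region
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def sq_dist(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))

def tsvq_encode(root: Node, x) -> Node:
    """Follow a root-to-leaf path, descending at each internal node to the
    child with the nearer code vector (O(depth) instead of O(#leaves))."""
    t = root
    while t.left is not None and t.right is not None:
        t = t.left if sq_dist(x, t.left.code) <= sq_dist(x, t.right.code) else t.right
    return t

# A depth-2 example tree over 1-D "vectors" (made-up code vectors).
root = Node((0.5,),
            left=Node((0.25,), Node((0.1,)), Node((0.4,))),
            right=Node((0.75,), Node((0.6,)), Node((0.9,))))
leaf = tsvq_encode(root, (0.37,))
print(leaf.code)   # the leaf reached by greedy descent: (0.4,)
```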

Suppose X is the random vector of the input space, and f(X) is its probability density function. There is a discrete probability space associated with each binary quantization tree. If a node t represents a region S, then its probability is

p(t) = \int_{S} f(X)\,dX.    (1)

The conditional probability of a node t with respect to its ancestor s can also be defined as p(t|s) = p(t)/p(s).

As is true for any quantization scheme, when an input vector X is quantized as y, a quantization error usually results and can be measured using some defined distortion measure d(X, y). We are not concerned with the specific distortion measure used in this study; we assume that a measure d is given. Let y_t be the code vector at a node t and S_t the region it represents. The distortion D(t) of node t is defined as

D(t) = \frac{1}{p(t)} \int_{S_t} d(X, y_t) f(X)\,dX.

If T is a quantization tree of k leaves l_1, l_2, \ldots, l_k, then the average distortion of T is the expected distortion over all leaves:

D(T) = \sum_{i=1}^{k} p(l_i) D(l_i).
1057-7149/95$04.00 © 1995 IEEE
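In practice the integrals over f(X) are replaced by empirical averages over training samples. A hedged sketch, with a made-up one-dimensional partition of [0, 1) into four equal leaves:

```python
# Sketch: estimating leaf probabilities p(l_i) and the average distortion
# D(T) = sum_i p(l_i) D(l_i) from samples, replacing integrals over f(X)
# with empirical averages. The partition and samples are invented.
samples = [0.05, 0.12, 0.33, 0.41, 0.58, 0.64, 0.77, 0.93]

# Hypothetical partition: leaf i covers [i/4, (i+1)/4) with its code vector
# at the interval center.
centers = [i / 4.0 + 0.125 for i in range(4)]

def leaf_of(x):
    return min(int(x * 4), 3)

n = len(samples)
p = [0.0] * 4            # empirical p(l_i)
pd = [0.0] * 4           # accumulates p(l_i) * D(l_i) directly
for x in samples:
    i = leaf_of(x)
    p[i] += 1.0 / n
    pd[i] += (x - centers[i]) ** 2 / n

D_T = sum(pd)            # average distortion over all leaves
print(round(sum(p), 6), round(D_T, 6))
```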


The primary goal in the design of tree-structured vector quantizers is to minimize the expected distortion. There is, however, also a cost associated with each quantization tree. If T is a quantization tree of k leaves l_1, l_2, \ldots, l_k, some commonly used cost functions are defined as follows:

1) the number of leaves in the tree, expressed by

C(T) = k    (2)

which measures the size of the tree and thus the storage cost;

2) the index entropy of the leaves, defined as

C(T) = -\sum_{i=1}^{k} p(l_i) \log p(l_i)    (3)

which measures the minimum number of bits required to losslessly encode the indices to the leaves;

3) the expected depth of the leaves, expressed by

C(T) = \sum_{i=1}^{k} p(l_i)\,\mathrm{depth}(l_i)    (4)

which measures the average number of steps needed to quantize an input vector and thus the average quantization time; the depth of a leaf l, denoted depth(l), is the number of edges in the path from the root to l;

4) the maximum depth of the tree,

C(T) = \max_{1 \le i \le k} \mathrm{depth}(l_i)    (5)

which measures the maximum number of steps required to quantize an input vector and thus bounds the maximum delay for the quantization of each input vector.

Rate-distortion theory is the study of the relationship between the expected distortion and the transmission rate [1]. In our treatment of TSVQ, the relationship is between the expected distortion and a more general cost function. The fundamental problem is how to achieve minimal distortion for a given cost constraint, or, vice versa, how to achieve minimal cost for a given distortion constraint. When the number of leaves is used as the cost function, we call the system size-constrained design. Similarly, we have entropy-constrained design for the entropy cost and depth-constrained design for the expected depth cost. Most of the existing design techniques are based on such single-cost constraints.

For a given input space, suppose V is the set of all tree partitions for the space. Each partition T ∈ V has an expected distortion D(T) and a cost C(T). For a given cost constraint c, a tree partition S ∈ V that satisfies C(S) ≤ c and

D(S) \le D(T) \quad \text{for all } T \in V \text{ with } C(T) \le c

is an optimal tree for the constraint c. An optimal tree has minimum expected distortion among all trees satisfying the cost constraint.

It is, in general, difficult to construct an optimal tree for a given cost constraint. For most cost measures, the optimization problem has been shown to be NP-hard [4], [6]. The design of tree-structured vector quantizers has thus been based mainly on heuristic approaches. One commonly used technique is to successively partition the input space into two regions, seeking an optimal bipartitioning that has minimum expected distortion at each subdivision [2], [9]. The key issue in this heuristic is to determine the order in which the subregions are partitioned. One simple approach is to partition the regions to a uniform depth [2]. Another is to partition the node that has the largest distortion at each subdivision [9]. Other selection criteria based on the distortion and cost measures have also been proposed in [10]-[12].

In general, for a selection criterion to be useful, it must be related to the distortion or the cost of the tree and should be easy to compute. A class of useful optimization criteria can be derived from the framework of distortion-rate theory [1]. For a given input space, every tree partitioning S gives an expected distortion D(S) and an expected cost C(S). The operational distortion-cost function for TSVQ was defined by Chou et al. [3] as

D(C) = \inf_{S} \{ D(S) \mid C(S) \le C \}.    (6)

The heuristic of successive partitioning can be used to compute this operational distortion-cost function. Let S be a quantization tree with distortion D(S) and cost C(S). If we further split a leaf l of S into l_1 and l_2 and denote the resulting tree S(l), then

D[S(l)] = D(S) - p(l)D(l) + p(l_1)D(l_1) + p(l_2)D(l_2).

The slope of the distortion-cost function at this point is

\lambda(l) = \frac{D(S) - D[S(l)]}{C[S(l)] - C(S)} = \frac{p(l)D(l) - p(l_1)D(l_1) - p(l_2)D(l_2)}{p(l_1)C(l_1) + p(l_2)C(l_2) - p(l)C(l)} = -\frac{\Delta D(l)}{\Delta C(l)}.

A sensible selection criterion is to split a leaf that gives the largest value of λ(l). If we draw the distortion D as a function of the cost C as the tree grows, D is typically monotonically decreasing. The slope of D(C) indicates the rate at which the distortion decreases as the cost increases.
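The four cost measures can be computed directly from the leaf probabilities and depths. A minimal sketch, where the example tree (one depth-1 leaf, one depth-2 leaf, two depth-3 leaves) is hypothetical:

```python
# Sketch of the four TSVQ cost measures, computed from (p(l_i), depth(l_i))
# pairs describing the leaves of a binary quantization tree.
import math

leaves = [(0.5, 1), (0.25, 2), (0.125, 3), (0.125, 3)]   # (p(l_i), depth(l_i))

def size_cost(leaves):                    # eq. (2): number of leaves
    return len(leaves)

def entropy_cost(leaves):                 # eq. (3): leaf index entropy (bits)
    return -sum(p * math.log2(p) for p, _ in leaves if p > 0)

def expected_depth_cost(leaves):          # eq. (4): average search length
    return sum(p * d for p, d in leaves)

def max_depth_cost(leaves):               # eq. (5): worst-case search length
    return max(d for _, d in leaves)

print(size_cost(leaves), entropy_cost(leaves),
      expected_depth_cost(leaves), max_depth_cost(leaves))
```

In this example the entropy and the expected depth coincide (both 1.75 bits/steps) because each leaf happens to satisfy p(l_i) = 2^{-depth(l_i)}; in general the entropy is only bounded above by the expected depth, as shown later in the correspondence.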

The above splitting criterion has been used with specific cost functions [2], [9], [11]. It can also be extended to more than one level of leaves, as discussed in the technique of tree pruning [3] and in tree growing with multiple-level lookahead [5], [12]. The focus of this correspondence is on the choice of cost functions as the design constraints, and we will simply use criteria based on one-level splits. It is known that better compression performance can generally be obtained with the more sophisticated splitting criteria.

III. DESIGN WITH MULTIPLE COST CONSTRAINTS

The various cost functions measure different computational resources required for a TSVQ. The existence of multiple cost measures makes it necessary to determine which one is to be used as the constraint in a particular design. In most applications, it is often desired to minimize the expected distortion subject to a rate constraint, as studied in rate-distortion theory. The leaf entropy of a quantization tree provides a first-order estimate of the encoding rate. The selection criterion based on this cost for the tree-growing heuristic would be

\lambda(l) = -\frac{\Delta D(l)}{\Delta C(l)} = \frac{p(l)D(l) - p(l_1)D(l_1) - p(l_2)D(l_2)}{p(l)\log p(l) - p(l_1)\log p(l_1) - p(l_2)\log p(l_2)}.
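The entropy-constrained criterion can be sketched numerically; the candidate split statistics below are invented for illustration, standing in for the distortions and probabilities a real design would estimate from training data.

```python
# Sketch of the entropy-constrained selection criterion: for each leaf l with
# a tentative split into l1, l2, compute
#   lambda(l) = [p(l)D(l) - p(l1)D(l1) - p(l2)D(l2)]
#             / [p(l)log p(l) - p(l1)log p(l1) - p(l2)log p(l2)]
# and split the leaf with the largest slope.
import math

def entropy_slope(p_l, D_l, p1, D1, p2, D2):
    dd = p_l * D_l - p1 * D1 - p2 * D2                 # distortion decrease
    dc = (p_l * math.log2(p_l)
          - p1 * math.log2(p1) - p2 * math.log2(p2))   # entropy increase (> 0)
    return dd / dc

# Two candidate leaf splits: (p(l), D(l), p(l1), D(l1), p(l2), D(l2)).
candidates = {
    "a": (0.6, 4.0, 0.3, 1.0, 0.3, 1.0),
    "b": (0.4, 2.0, 0.2, 1.5, 0.2, 1.5),
}
best = max(candidates, key=lambda k: entropy_slope(*candidates[k]))
print(best)   # "a": its slope is 1.8 / 0.6 = 3.0, versus 0.2 / 0.4 = 0.5
```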

To see the typical performance of such an entropy-constrained tree design, we have performed experiments in image compression. The tree-growing heuristic is used on a training set of 16 grey-scale images

of human faces, animal, truck, jet, boat, house, and other similar scenes. All the images are of size 512 x 512 with 256 grey levels. Vectors of 4 x 4 blocks of pixels were used in the design. The k-means algorithm [7], [8] with k = 2 is used for bipartitioning. The performance of the quantizers is based on their compression of a test image (the woman with a hat), which is not part of the training sequence.

Fig. 1. Distortion-entropy performance for tree size and entropy constraints (normalized MSE versus entropy rate per pixel).

Fig. 1 shows the results of the entropy-constrained design in terms of the distortion versus the first-order entropy of the compressed image measured in bits per pixel. The normalized MSE is the mean-square error of the compressed image divided by the variance of the original image. Signal-to-noise ratio (SNR) can be obtained by taking the common logarithm of the inverse normalized MSE.

The compression results of the size-constrained design, where the heuristic selection criterion is the distortion decrease alone (each split increases the number of leaves by exactly one, so the cost change is constant),

\lambda(l) = p(l)D(l) - p(l_1)D(l_1) - p(l_2)D(l_2),    (7)

are also presented in Fig. 1. Clearly, the entropy-constrained design offers slightly lower distortion than the size-constrained design at the same entropy. This small gain is, however, achieved at a much higher storage cost, as shown in Fig. 2, where the distortion-size compression performance is presented for the two design methods. The trees obtained from the entropy-constrained design are much larger than those obtained from size-constrained design to achieve about the same distortion. For instance, compare the tree with 2^10 leaves from entropy-constrained design and the one with 2^5 leaves from size-constrained design. It is not possible for entropy-constrained design to achieve very low distortion in practice because of the large memory requirement.

From these experimental results, it is clear that the leaf entropy and the number of leaves may be competing cost measures. Reducing one cost may drive the other cost higher. The number of leaves in a tree provides an upper bound to the leaf entropy:

-\sum_{i=1}^{k} p(l_i) \log p(l_i) \le \log k.

There does not exist a lower bound for the leaf entropy in terms of the number of leaves.
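A quick numeric check of this bound (random leaf distributions; the uniform distribution attains equality):

```python
# Check that the leaf entropy never exceeds log2(k) for random leaf
# probability distributions, and that the uniform distribution attains it.
import math, random

random.seed(0)
for k in (2, 5, 16):
    w = [random.random() for _ in range(k)]
    p = [x / sum(w) for x in w]
    H = -sum(q * math.log2(q) for q in p)
    assert H <= math.log2(k) + 1e-12           # the stated upper bound
    uniform = [1.0 / k] * k
    H_u = -sum(q * math.log2(q) for q in uniform)
    print(k, round(H, 3), round(H_u, 3))       # H_u equals log2(k)
```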

Fig. 2. Distortion-size performance for tree size and entropy constraints (normalized MSE versus number of leaves).

The same is true between the leaf entropy and the expected depth of a tree. If we take the difference of the two cost functions (with logarithms to base 2), we have

-\sum_{i=1}^{k} p(l_i) \log p(l_i) - \sum_{i=1}^{k} p(l_i)\,\mathrm{depth}(l_i) = \sum_{i=1}^{k} p(l_i) \log \frac{2^{-\mathrm{depth}(l_i)}}{p(l_i)}.

Since \sum_{i=1}^{k} 2^{-\mathrm{depth}(l_i)} = 1 for any full binary tree, from one of the well-known properties of entropy (the divergence inequality), the previous difference is always smaller than or equal to zero. We thus have an upper bound for the entropy cost in terms of the expected depth. There exists, however, no lower bound. The same relationship exists between the expected depth and the maximum depth of a tree as well.
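The argument can be checked numerically on a made-up full binary tree, whose leaf depths satisfy the Kraft equality:

```python
# Numeric illustration: for a full binary tree the leaf depths satisfy
# sum_i 2^(-depth_i) = 1 (Kraft equality), and the leaf entropy is then
# bounded above by the expected depth. Tree and probabilities are invented.
import math

depths = [1, 2, 3, 4, 4]                 # leaves of a full binary tree
assert abs(sum(2.0 ** -d for d in depths) - 1.0) < 1e-12

p = [0.4, 0.25, 0.15, 0.1, 0.1]          # arbitrary leaf probabilities
entropy = -sum(q * math.log2(q) for q in p)
expected_depth = sum(q * d for q, d in zip(p, depths))
print(round(entropy, 3), round(expected_depth, 3), entropy <= expected_depth)
```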

Because of the contending nature of various cost measures, the design of TSVQ based on constraints of single cost measure may lead to trees that have low cost in terms of one measure but higher cost in terms of other measures. This may not be desirable in practice, especially when the design is based on a set of training samples. Overspecialization to the training data for one cost measure could reduce the performance for data outside of the training set.

One of the solutions to the problem is to include multiple cost constraints in the design. The distortion-cost function defined in (6) can be easily extended to a function of multiple cost measures. Let C_1 and C_2 be two cost functions on quantization trees. The dual-variate distortion-cost function for tree-structured VQ is defined as

D(c_1, c_2) = \inf_{S} \{ D(S) \mid C_1(S) \le c_1 \text{ and } C_2(S) \le c_2 \}

where the infimum is taken over every tree partitioning S. It is, in general, even harder to construct a tree that achieves D(c_1, c_2) for given constraints c_1 and c_2. It is also difficult to incorporate multiple cost constraints in the existing heuristic design algorithms.

In the special case that one of the two constraints is the maximum tree depth, however, both constraints can be easily accommodated in most of the heuristic design techniques. For the heuristic tree-growing algorithm, for instance, the satisfiability of the maximum depth constraint can be checked easily at each successive partitioning. In fact, the maximum depth constraint is implied in the pruning heuristic of Chou et al. [3] when the initial tree is complete.
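In a tree-growing loop, the maximum-depth constraint amounts to excluding leaves at the depth limit from the split candidates; a sketch with invented slope values:

```python
# Sketch of a maximum-depth constraint in greedy tree growing: a leaf is
# simply dropped from the split candidates once it reaches the depth limit.
# The frontier entries and slopes below are stand-in numbers, not a real
# distortion model.
MAX_DEPTH = 3

# (leaf name, current depth, slope lambda(l) of its best tentative split)
frontier = [("a", 3, 5.0), ("b", 2, 4.0), ("c", 1, 1.0)]

def next_split(frontier, max_depth):
    """Pick the eligible leaf with the steepest slope, or None if all leaves
    have reached the depth limit."""
    eligible = [leaf for leaf in frontier if leaf[1] < max_depth]
    return max(eligible, key=lambda leaf: leaf[2], default=None)

print(next_split(frontier, MAX_DEPTH))   # "a" is blocked by the depth limit
```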

To see the effect of this additional constraint on the tree depth, we have applied it to the entropy-constrained tree growing design on the

same image data as used earlier. The specific constraint used in this experiment is a maximum depth of 13. The compression results for the test image are presented in Table I together with those obtained from entropy-constrained design.

TABLE I
COMPARISON OF TSVQ DESIGN BASED ON SINGLE AND MULTIPLE COST CONSTRAINTS
(Entropy constraint alone versus entropy + maximum-depth constraint; for each design: number of leaves, entropy, and normalized MSE.)

It can be seen from Table I that for trees of small size, the depth constraint has no effect because depth 13 has not been reached. When the tree grows larger, the additional constraint controls the tree size significantly with little loss in compression performance. For example, if we compare the tree with 4096 leaves obtained by entropy constraint alone and the one with 1024 leaves constructed with the additional depth constraint, both achieve virtually the same distortion at the same entropy, but the former has four times as many leaves as the latter.

IV. COMBINED-COST CONSTRAINTS

The benefit of having multiple cost constraints is clear from the above results. Unfortunately, this is difficult to do even for two different cost functions in general. A solution we propose is to combine cost functions together and use the combination as one constraint. We use a weighted summation to combine the cost functions involved. This allows the consideration of any number of cost measures, with the relative importance of each cost reflected by its weight. In the case of two cost functions C_1 and C_2, for example, the combined cost is

C' = \delta C_1 + (1 - \delta) C_2

where 0 ≤ δ ≤ 1 is a real value. In the extreme case when δ is zero or one, the combined cost reduces to just a single cost.
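A minimal sketch of the combined cost, here pairing leaf entropy (C_1) and tree size (C_2); the example leaves are hypothetical, and in practice the two costs may need comparable scaling before weighting:

```python
# Sketch of the combined cost C' = delta*C1 + (1-delta)*C2 from Section IV,
# combining leaf entropy (C1) and tree size (C2) for a hypothetical tree.
import math

def combined_cost(leaves, delta):
    entropy = -sum(p * math.log2(p) for p, _ in leaves if p > 0)   # C1
    size = float(len(leaves))                                      # C2
    return delta * entropy + (1.0 - delta) * size

leaves = [(0.5, 1), (0.25, 2), (0.25, 2)]     # (p(l_i), depth(l_i))
print(combined_cost(leaves, 0.5))    # equal weighting, delta = 0.5
print(combined_cost(leaves, 1.0))    # reduces to the entropy cost alone
```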

It is important to note that if the original cost functions are tree functionals that satisfy the properties of recursiveness, positiveness, and monotonicity [3], then the combined cost function clearly has these properties as well. This means that the various design algorithms for the single-cost constraint can be directly used for a combined-cost constraint.

As an example, we applied a combined cost of the leaf entropy and tree size to the same heuristic tree-growing design and image data as used before. We treated the two costs as being equally important, that is, δ = 0.5. Fig. 3 shows the compression results for the test image in terms of the normalized distortion and the entropy together with results from size-constrained design and entropy-constrained design. It can be seen that the distortion-entropy performance of the combined size- and entropy-constrained design is better than that of size-constrained design and very close to that of entropy-constrained design. In fact, the design with the combined-cost constraint performs even better at a number of places than the design with just the entropy constraint. This can happen because the performance shown is based

Fig. 3. Distortion-entropy performance of combined, size, and entropy constraints.

Fig. 4. Distortion-size performance of combined, size, and entropy constraints.

on the compression of a test image outside of the training data for the design. It is known that entropy optimization based on a training set may lead to overspecialization, and the performance for data outside of the training set may not be as good. The use of multiple cost constraints is a way to address the problem.

The distortion-size performance of the design based on the combined-cost constraint is also remarkable compared with that of the single-cost constraints, as shown in Fig. 4. The curve stays very close to that of size-constrained design and is much better than that of entropy-constrained design. These results show that the design based on a combined-cost function can minimize the expected distortion while keeping the cost low in terms of more than one measure. We believe that such an approach is very useful in practical tree-structured vector quantizer design.

V. SUMMARY

We have considered the design of tree-structured vector quantizers to minimize the expected distortion subject to a cost constraint. Since numerous cost measures can be used, design based on single-cost constraints has led to trees that have low cost in terms of one measure but high cost in terms of another. We have studied the relationship of various cost functions and developed a scheme to incorporate multiple cost measures in the design. Experimental results in image compression showed that the technique improves the performance of the current design significantly.


REFERENCES

[1] T. Berger, Rate Distortion Theory: A Mathematical Basis for Data Compression. Englewood Cliffs, NJ: Prentice-Hall, 1971.
[2] A. Buzo, A. H. Gray, Jr., R. M. Gray, and J. D. Markel, "Speech coding based upon vector quantization," IEEE Trans. Inform. Theory, vol. IT-28, no. 5, pp. 562-574, 1980.
[3] P. A. Chou, T. Lookabaugh, and R. M. Gray, "Optimal pruning with applications to tree-structured source coding and modeling," IEEE Trans. Inform. Theory, vol. 35, no. 2, pp. 299-315, 1989.
[4] J. Lin, "Vector quantization for image compression: Algorithms and performance," Ph.D. dissertation, Brandeis Univ., Waltham, MA, 1992.
[5] J. Lin and J. A. Storer, "Design and performance of tree-structured vector quantizers," Inform. Processing Mgmt., vol. 30, no. 6, pp. 851-862, 1994.
[6] J. Lin, J. A. Storer, and M. Cohn, "Optimal pruning for tree-structured vector quantization," Inform. Processing Mgmt., vol. 28, no. 6, pp. 723-733, 1992.
[7] Y. Linde, A. Buzo, and R. M. Gray, "An algorithm for vector quantizer design," IEEE Trans. Commun., vol. 28, pp. 84-95, 1980.
[8] S. P. Lloyd, "Least squares quantization in PCM," IEEE Trans. Inform. Theory, vol. IT-28, no. 2, pp. 129-136, 1982.
[9] J. Makhoul, S. Roucos, and H. Gish, "Vector quantization in speech coding," Proc. IEEE, vol. 73, pp. 1551-1588, 1985.
[10] K. L. Oehler, E. A. Riskin, and R. M. Gray, "Unbalanced tree-growing algorithms for practical image compression," in Proc. IEEE ICASSP, 1991, pp. 2293-2296, vol. 4.
[11] E. A. Riskin and R. M. Gray, "A greedy tree growing algorithm for the design of variable rate vector quantizers," IEEE Trans. Signal Processing, vol. 39, no. 11, pp. 2500-2507, 1991.
[12] E. A. Riskin and R. M. Gray, "Look ahead in growing tree-structured vector quantizers," in Proc. IEEE ICASSP, 1991, pp. 2289-2292, vol. 4.

Extended Lapped Transform in Image Coding

Ricardo L. de Queiroz and K. R. Rao

Abstract-A modulated lapped transform with extended overlap (ELT) is investigated in image coding with the objective of verifying its potential to replace the discrete cosine transform (DCT) in specific applications. Some of the criteria utilized for the performance comparison are reconstructed image quality (both objective and subjective), reduction of blocking artifacts, robustness against transmission errors, and filtering (for scalability). Also, a fast implementation algorithm for finite-length signals using symmetric extensions is developed specially for the ELT with overlap factor 2 (ELT-2). This comparison shows that ELT-2 is superior to both DCT and the lapped orthogonal transform (LOT).

I. INTRODUCTION

While block transforms became very popular in the image coding field, the lapped orthogonal transform (LOT) [1] arose as a promising competitor to such transforms as the discrete cosine transform (DCT) [2], which is the block transform used in most image and video coding algorithms [2]. The advantage of lapped transforms [3] resides in the length of their basis functions, providing improved filtering capabilities and reduction of blocking artifacts commonly present in block transform coding at low bit rates. Furthermore, the concept of lapped transforms was established and proven to be equivalent to the concept of paraunitary FIR uniform filter banks [3], [4]. Under this point of view, both the LOT and the DCT are considered as special choices of paraunitary filter banks [3], [4]. Cosine-modulated filter banks [4] allow perfect reconstruction (PR) in paraunitary analysis-synthesis systems, using a modulation of a lowpass prototype by a cosine train. By a proper choice of the phase of the modulating cosine, Malvar developed the modulated lapped transform (MLT) [5], which led to the so-called extended lapped transforms (ELT) [3], [6]. The ELT allows several overlapping factors, generating a family of PR cosine-modulated filter banks. Other cosine-modulation approaches have also been developed (see, for example, [4], [7], and references therein), and the most significant difference among them is the lowpass prototype choice and the phase of the cosine sequence.

Let M and L be the number of channels and the filters' length, respectively, where, for the ELTs, L = 2KM, and K is the overlap factor. The analysis filters f_m(n) are time-reversed versions of the synthesis filters g_m(n), as in any paraunitary filter bank (for m = 0, 1, \ldots, M-1 and n = 0, 1, \ldots, L-1). The ELT class is defined by [3], [6]

g_m(n) = f_m(L - 1 - n) = h(n) \sqrt{\frac{2}{M}} \cos\left[\left(m + \frac{1}{2}\right)\left(n - \frac{L-1}{2}\right)\frac{\pi}{M}\right]    (1)

for m = 0, 1, \ldots, M-1 and n = 0, 1, \ldots, L-1. h(n) is a symmetric window modulating the cosine sequence and the impulse response of a lowpass prototype (with cutoff frequency at π/2M), which is translated in the frequency domain to M different frequency slots in order to construct the uniform filter bank. We will use the ELT with K = 2, which will be designated as ELT-2, and assume row-column separable implementation of the transform. Therefore, one-dimensional (1-D) analysis of the transform implementation is sufficient for two-dimensional (2-D) applications.

The lattice-style algorithm [3] is shown in Fig. 1 for an ELT with generic overlap factor K. The stages Θ_n contain the plane rotations and are defined by

C_n = \mathrm{diag}\{\cos(\theta_{0,n}), \ldots, \cos(\theta_{M/2-1,n})\}
S_n = \mathrm{diag}\{\sin(\theta_{0,n}), \ldots, \sin(\theta_{M/2-1,n})\}

where J is the M/2 counter-identity (reversing) matrix [3]. Also, θ_{i,n} are rotation angles such that Θ_n is of the form indicated in Fig. 2, containing M/2 orthogonal butterflies. We use the optimized angles presented in [3].

Manuscript received July 12, 1993; revised June 8, 1994. This work was supported in part by the Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), Brazil, under Grant 200.804-90-1. The associate editor coordinating the review of this paper and approving it for publication was Prof. Nasser M. Nasrabadi.
The authors are with the Electrical Engineering Department, University of Texas at Arlington, Arlington, TX 76019 USA.
IEEE Log Number 9411133.

Fig. 1. Flow graph for the direct (top) and inverse (bottom) ELT. Each branch carries M/2 samples.

1057-7149/95$04.00 © 1995 IEEE