
532 IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, VOL. 4, NO. 4, OCTOBER 2007

Crisp and Fuzzy Adaptive Spectral Predictions for Lossless and Near-Lossless Compression of Hyperspectral Imagery

Bruno Aiazzi, Luciano Alparone, Stefano Baronti, Member, IEEE, and Cinzia Lastri

Abstract—This letter presents an original approach that exploits classified spectral prediction for lossless/near-lossless hyperspectral-image compression. Minimum-mean-square-error spectral predictors are calculated, one for each small spatial block of each band, and are classified (clustered) to yield a user-defined number of prototype predictors that are capable of matching the spectral features of different classes of pixel spectra for each wavelength. Such predictors are used to achieve a prediction, either crisp or fuzzy. Unlike most of the methods reported in the literature, the proposed approach exploits a purely spectral prediction that is suitable for compressing the data in band-interleaved-by-line format, as they are available at the output of the onboard instrument. In that case, the training phase, i.e., clustering and refining of predictors for each wavelength, may be moved offline. Experimental results on Airborne Visible InfraRed Imaging Spectrometer data show improvements over the most advanced methods in the literature, with a computational complexity that is far lower than that of analogous methods by the same and other authors.

Index Terms—Adaptive prediction, hyperspectral imagery, lossless compression, near-lossless compression, onboard data compression, spectral decorrelation.

I. INTRODUCTION

A CHALLENGE of satellite hyperspectral imaging is data compression for dissemination to users and, particularly, for transmission to the ground station from the orbiting platform.

Data compression consists of decorrelation of the correlated information source, possibly followed by quantization to reduce transmission rates, and entropy coding of residues. To meet the quality requirements of hyperspectral-image analysis, differential pulse code modulation (DPCM) is usually employed for lossless/near-lossless compression [1]. That is, the decompressed data have a user-defined maximum absolute error, which is zero in the lossless case.

Manuscript received February 13, 2007; revised March 29, 2007.
B. Aiazzi, S. Baronti, and C. Lastri are with the Istituto di Fisica Applicata “Nello Carrara,” CNR Area della Ricerca di Firenze, 50019 Sesto F.no (FI), Italy (e-mail: [email protected]; [email protected]; [email protected]).
L. Alparone is with the Dipartimento di Elettronica e Telecomunicazioni, Università di Firenze, 50139 Firenze, Italy (e-mail: [email protected]).
Digital Object Identifier 10.1109/LGRS.2007.900695

DPCM basically consists of a prediction followed by entropy coding of the quantized differences between original and predicted values. A unit quantization step size allows reversible compression to be achieved as a limit case. The simplest way to design a predictor, once a causal neighborhood, i.e., a set of pixels that have been previously scanned, is fixed, is to take a linear combination of the values of its samples. The coefficients of the predictor can be calculated to yield a minimum-mean-squared error (mmse) over the whole image. Such a prediction is optimal only for stationary signals. Therefore, two strategies have been proposed to overcome this drawback and obtain an adaptive prediction. Adaptive DPCM (ADPCM), in which the coefficients of predictors are continuously recalculated from the incoming new data, is traditionally used for 1-D signals [2]. A more recent DPCM typology that is suitable for digital images is the classified DPCM, in which a number of statistical classes of samples are preliminarily recognized, an optimized predictor is calculated for each class, and such predictors are switched, in either a hard [3] or a soft [4] fashion, to attain the best space-varying prediction. The two strategies of hard/soft classified prediction will be referred to as adaptive selection/combination of adaptive predictors (ASAP/ACAP).
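For reference, the whole-image mmse linear predictor described above can be written as follows. The notation is an assumption made for this sketch and is not taken from the letter: $g_k(n)$ denotes the k-th sample of the causal neighborhood of pixel n, $\phi_k$ its coefficient, and S the number of neighborhood samples.

$$\hat{g}(n) = \sum_{k=1}^{S} \phi_k\, g_k(n), \qquad \boldsymbol{\phi}^{\ast} = \arg\min_{\boldsymbol{\phi}} \sum_{n} \bigl[ g(n) - \hat{g}(n) \bigr]^{2}$$

Setting the gradient to zero gives the normal equations $\mathbf{R}\,\boldsymbol{\phi}^{\ast} = \mathbf{p}$, with $R_{kl} = \sum_n g_k(n)\, g_l(n)$ and $p_k = \sum_n g_k(n)\, g(n)$. A single coefficient vector is fitted to the whole image, which is why the result is optimal only when the image statistics do not change from place to place.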

Whenever multiband images are to be compressed, advantage may be taken of the spectral correlation of the data in designing a prediction that is both spatial and spectral from a causal neighborhood of pixels [1]. Causal means that only previously scanned pixels on the current and previously encoded bands may be used to predict the current-pixel (CP) value. This strategy is more and more effective as the spectral correlation increases, as happens with hyperspectral data.

State-of-the-art schemes feature 3-D prediction, i.e., jointly spatial and spectral, whose support comprises pixels belonging to the current and a few previously scanned bands, one [5] or two at most [6]–[8]. The ACAP encoder [4] has been extended by the authors to 3-D data [9], in the same way as the 3-D ASAP encoder [6], by simply changing the 2-D neighborhood (spatial) [3] into a 3-D one spanning up to three previous bands.

A different approach that is specific to hyperspectral images is the extension of 3-D CALIC [10], which was originally conceived for color images, having few spectral bands, to image data having a greater number of highly correlated bands. The method, which is referred to as M-CALIC [8], significantly outperforms the 3-D CALIC, to which it largely adheres, with a moderately increased computational complexity and absence of setup parameters that are crucial for performance.

The nonstationarity of hyperspectral data in both spatial and spectral domains, together with the computational constraints, means that the jointly spatial and spectral prediction gains a negligible extra advantage from a number of previous bands greater than two [6]–[8]. However, if the original hyperspectral pixel vectors are classified into spatially homogeneous classes, whose map must be transmitted as side information, then a purely spectral prediction may be profitably carried out on pixel spectra belonging to each class by means of a different set of linear spectral predictors, as many as the bands, of length up to 20, i.e., spanning up to 20 previous bands. This original approach was recently introduced for lossless compression [11] and provided the best compression ratios in the literature. Unfortunately, the computational effort in clustering the hyperspectral pixel vectors makes the method unsuitable for applications requiring low complexity. In fact, computing power is limited for satellite onboard compression, and coding benefits must be traded off against computational complexity [5], [8], [12]. Furthermore, since the cost of overhead (classification map and spectral predictors) is independent of the target compression ratio, the method [11] does not seem advisable for near-lossless compression, which might be achieved in principle.

1545-598X/$25.00 © 2007 IEEE

Fig. 1. Simplified flowchart of the classified-DPCM encoder. Prediction is accomplished either on a 3-D neighborhood (jointly spatial and spectral) or on a 1-D neighborhood (purely spectral), whose pixels are labeled by an increasing Euclidean distance from the CP. The encoder is switchable between (upper branch) the fuzzy prediction and (lower branch) the crisp prediction.

Before concluding this review, we recall that a forerunner of the ACAP paradigm is the fuzzy 3-D DPCM [7], in which the prototype mmse spatial/spectral linear predictors were calculated on clustered data, as happens in [11].

II. CRISP/FUZZY CLASSIFIED 3-D/SPECTRAL PREDICTION

Fig. 1 summarizes the ACAP/ASAP encoder. The sole difference between the proposed method and the methods described in [6] (ASAP) and [9] (ACAP) is the prediction support, which is purely spectral instead of jointly spatial and spectral. The set of predictors is crucial for coding performance. An efficient prediction should be capable of matching locally varying spectral trends as much as possible. After preliminarily partitioning the input image bands into small blocks, e.g., 8 × 8, a causal prediction support of size S is set, and the S coefficients of an mmse linear predictor are calculated for each block by means of a least squares (LS) algorithm.

The aforementioned process produces a large number of predictors, each optimized for a single block of the current band. The S coefficients of each predictor are arranged into an S-dimensional space. It can be noticed that statistically similar blocks exhibit similar predictors. Fig. 2 shows how predictors calculated from a band of a true hyperspectral image tend to cluster in the hyperspace, instead of being uniformly spread.
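The per-block LS fit described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `solve_linear` and `block_predictor` are hypothetical, the block is passed as flat lists of pixel values, and the normal equations are solved by plain Gaussian elimination without any regularization (a flat or degenerate block would need extra care).

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def block_predictor(current, previous):
    """LS coefficients predicting `current` from S previous bands.

    current:  pixel values of one block in the band being coded
    previous: list of S lists, the same block in the S previous bands
    """
    s = len(previous)
    # Normal equations (X^T X) phi = X^T y, accumulated over block pixels.
    xtx = [[sum(p * q for p, q in zip(previous[i], previous[j]))
            for j in range(s)] for i in range(s)]
    xty = [sum(p * y for p, y in zip(previous[i], current))
           for i in range(s)]
    return solve_linear(xtx, xty)


# Toy block: the current band is an exact linear mix of two previous bands.
prev = [[1, 2, 3, 4, 5, 7], [2, 1, 4, 3, 6, 5]]
cur = [2 * a - 0.5 * b for a, b in zip(*prev)]
phi = block_predictor(cur, prev)   # one point in the S-dimensional space
```

Each block yields one coefficient vector `phi`; collecting these vectors over all blocks of a band gives the cloud of points that is subsequently clustered.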

Fig. 2. Example of (dots) block predictors of length three and (stars) four clusters obtained through the fuzzy C-means algorithm.

Once M prototype predictors, with M defined by the user, have been found through a clustering algorithm, they are used to initialize an iterative refinement procedure, either crisp or fuzzy. In the former case, image blocks are assigned to the M classes based on the minimum-mean-squared prediction error, and a new set of predictors is recalculated for each class. In the latter case, pixels are given degrees of membership to predictors. The membership function of a pixel to a predictor is inversely related to the average prediction error produced by that predictor in a causal neighborhood of that pixel [4]. Then, each predictor is recalculated based on the membership of pixels to it. The procedure is analogous to a relaxation labeling, in which the labeling is not crisp but fuzzy. The final sets of refined predictors, one per wavelength, are transmitted as side information.
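The labeling half of one crisp refinement iteration can be sketched as follows. This is an illustrative sketch only: the names `block_mse` and `label_blocks` are hypothetical, and the recalculation of each class predictor from its assigned blocks (the other half of the iteration) is omitted.

```python
def block_mse(phi, current, previous):
    """Mean squared prediction error of predictor `phi` on one block."""
    err = 0.0
    for n, y in enumerate(current):
        pred = sum(c * band[n] for c, band in zip(phi, previous))
        err += (y - pred) ** 2
    return err / len(current)


def label_blocks(predictors, blocks):
    """Assign each block to the prototype predictor with the lowest mse.

    predictors: list of M coefficient lists (the current prototypes)
    blocks:     list of (current, previous) pairs, one per spatial block
    """
    labels = []
    for current, previous in blocks:
        errors = [block_mse(phi, current, previous) for phi in predictors]
        labels.append(errors.index(min(errors)))
    return labels


# Two toy blocks: the first copies previous band 1, the second band 2.
b1 = [5.0, 6.0, 7.0]
b2 = [1.0, 9.0, 2.0]
blocks = [(b1, [b1, b2]), (b2, [b1, b2])]
labels = label_blocks([[1.0, 0.0], [0.0, 1.0]], blocks)
```

In a full refinement loop, the blocks sharing a label would then be pooled to refit that class predictor, and the two steps would alternate until the labels stop changing.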

The coding phase is also different, depending on whether the combination of predictors is hard or soft. In the former case (ASAP) [3], [6], each band is raster scanned, and the value of each pixel belonging to one block of the original partition is predicted by using the one of the M predictors whose residue exhibits a minimum mse. In the latter case (ACAP) [4], [9], the final prediction is a combination of the outputs of all predictors, according to the same fuzzy membership function used for training. The ASAP approach requires transmission of block labels; the ACAP approach does not, because the weights of the combination are calculated from the previously encoded data that are available at the decoder.
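The soft (ACAP) combination can be illustrated with the sketch below. The letter only states that membership is inversely related to the average prediction error in a causal neighborhood; the specific form used here (reciprocal of the average squared error, normalized to sum to one) and the guard constant `eps` are assumptions made for illustration, and `fuzzy_prediction` is a hypothetical name.

```python
def fuzzy_prediction(outputs, past_errors, eps=1e-6):
    """Combine M predictor outputs with weights derived from past errors.

    outputs:     the M candidate predictions for the current pixel
    past_errors: for each predictor, its average squared error on
                 already-decoded neighbours of the current pixel
    eps:         assumed guard against division by zero
    """
    raw = [1.0 / (e + eps) for e in past_errors]  # inverse-error membership
    total = sum(raw)
    weights = [r / total for r in raw]            # normalize to sum to 1
    prediction = sum(w * o for w, o in zip(weights, outputs))
    return prediction, weights


# Two predictors that were equally accurate so far get equal weight.
pred, w = fuzzy_prediction([100.0, 200.0], [1.0, 1.0])
```

Because `past_errors` depends only on samples that the decoder has already reconstructed, the decoder can recompute the same weights, which is why no labels need to be transmitted in the soft case.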

Fig. 3. Bit rates on disk for the lossless compression of the fourth scene of AVIRIS Cuprite Mine ’97: S-RLP versus 3-D-RLP varying with the prediction length. The number of classes (predictors) is always 16.

In the case of lossless compression, the integer-valued prediction errors are directly delivered to the context-coding section. Otherwise, they are uniformly quantized with a step size Δ, to achieve L∞ error-bounded compression (near lossless) before entropy coding.
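The error-bounded quantization step can be illustrated with a closed-loop DPCM sketch. Several choices here are assumptions and not taken from the letter: the odd step Δ = 2δ + 1 for a target peak error δ is a common convention for integer data, the predictor is a trivial previous-sample one instead of the classified spectral predictor, and the function names are hypothetical.

```python
def quantize(err, delta):
    """Quantization index of a prediction error for peak-error bound delta."""
    step = 2 * delta + 1              # assumed odd step; delta = 0 is lossless
    mag = (abs(err) + delta) // step
    return mag if err >= 0 else -mag


def dpcm_encode(samples, delta):
    """Closed-loop DPCM with a previous-sample predictor (illustrative)."""
    step = 2 * delta + 1
    indices, recon = [], []
    prev = 0                          # predictor state, mirrored by the decoder
    for x in samples:
        q = quantize(x - prev, delta)
        prev = prev + q * step        # reconstruct exactly as the decoder will
        indices.append(q)
        recon.append(prev)
    return indices, recon


def dpcm_decode(indices, delta):
    """Rebuild the samples from quantization indices only."""
    step = 2 * delta + 1
    out, prev = [], 0
    for q in indices:
        prev = prev + q * step
        out.append(prev)
    return out


samples = [100, 103, 99, 120, 118, 90, -4]
idx, rec = dpcm_encode(samples, delta=2)
peak_error = max(abs(a - b) for a, b in zip(samples, rec))  # at most delta
```

Predicting from reconstructed rather than original samples keeps encoder and decoder aligned, so the per-sample error never exceeds δ; it is also the source of the quantization noise feedback that degrades prediction at low bit rates.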

III. RESULTS AND COMPARISONS

Thorough experiments have been performed to demonstrate the capabilities of the novel spectral predictions, compared with the analogous 3-D predictions and with the most advanced DPCM-based schemes in the literature.

A. Data Set and Distortion Measurements

The data set comprises four hyperspectral images collected in 1997 by the Airborne Visible InfraRed Imaging Spectrometer (AVIRIS), which is operated by NASA/JPL. The four test sites are Cuprite Mine and Lunar Lake in Nevada, and Moffett Field and Jasper Ridge in California. All images comprise 224 bands recorded at different wavelengths in the range of 380–2500 nm, with a nominal spectral separation of 10 nm between two adjacent bands. Each image is constituted by a variable number of scenes of size 512 lines by 614 columns. All data that have been considered for compression are in radiance units.

Let {g(i, j)}, with 0 ≤ g(i, j) ≤ $g_{fs}$, where $g_{fs}$ is the full scale, denote an integer-valued N-pixel digital image and {g̃(i, j)} its distorted version, which is also integer valued, achieved by compressing {g(i, j)} and decoding the outcome bit stream. The widely used distortion measurements are the following:

1) mse, or squared $L_2$ distance between the original and distorted images ($L_2^2$)

$$\mathrm{mse} = \frac{1}{N} \sum_{i} \sum_{j} \left[ g(i,j) - \tilde{g}(i,j) \right]^{2} \tag{1}$$

2) maximum absolute distortion (MAD), or peak error, or $L_\infty$ distance between the original and distorted images

$$\mathrm{MAD} = \max_{i,j} \left\{ \left| g(i,j) - \tilde{g}(i,j) \right| \right\} \tag{2}$$

3) peak signal-to-noise ratio (PSNR)

$$\mathrm{PSNR(dB)} = 10 \log_{10} \frac{g_{fs}^{2}}{\mathrm{mse} + \frac{1}{12}} \tag{3}$$

in which the mse in the denominator is incremented by the variance of the integer roundoff error to handle the limit lossless case, when mse = 0. Thus, the PSNR will be upper bounded by $10 \log_{10}(12 \cdot g_{fs}^{2})$ in the lossless case, to indicate that the analog signal detected by the sensor has been quantized before being reversibly compressed.

Fig. 4. Bit rates on disk for the lossless compression of the fourth scene of AVIRIS Cuprite Mine ’97: 16 classes and prediction of length 20 for both methods.
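The three distortion measures (1)–(3) translate directly into code. The sketch below is only an illustration: images are assumed to be passed as flat lists of integers, and the function names are hypothetical.

```python
import math


def mse(orig, dist):
    """Mean squared error between two equally sized images, as in (1)."""
    return sum((a - b) ** 2 for a, b in zip(orig, dist)) / len(orig)


def mad(orig, dist):
    """Maximum absolute distortion (peak error), as in (2)."""
    return max(abs(a - b) for a, b in zip(orig, dist))


def psnr_db(orig, dist, g_fs):
    """PSNR in dB with the 1/12 roundoff-noise correction, as in (3)."""
    return 10.0 * math.log10(g_fs ** 2 / (mse(orig, dist) + 1.0 / 12.0))


g = [10, 20, 30, 40]
d = [11, 20, 28, 40]
print(mse(g, d), mad(g, d), psnr_db(g, d, 2 ** 15 - 1))
# Lossless limit: mse = 0, so the PSNR equals 10*log10(12 * g_fs**2).
print(psnr_db(g, g, 2 ** 15 - 1))
```

Thanks to the 1/12 term, `psnr_db` stays finite when the two images are identical, so lossless operating points can be placed on the same plot as near-lossless ones.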

B. Lossless-Compression Performance Comparison

The first experiment is a comparison between the spectral and the jointly spatial-spectral predictions. The fourth scene of the test image Cuprite ’97 was reversibly compressed by means of both the crisp algorithms: the novel spectral relaxation labeled prediction (S-RLP) and the earlier 3-D-RLP. Bit rates, which are reported in bits per pixel per band, include all coding overheads: predictor coefficients, block labels, and arithmetic codewords. The following work parameters have been selected for each encoder: 16 classes (i.e., 16 predictors for each λ) and prediction length, i.e., number of coefficients of each predictor, which is equal to 5, 14, and 20. The 3-D prediction of RLP is always carried out from a couple of previous bands, except for the first band, coded in intramode, i.e., by 2-D DPCM, and the second band, which is predicted from one previous band only. The purely spectral 1-D prediction of S-RLP is carried out from the available previous bands up to the requested prediction length. The first band is coded with a 1-D spatial prediction, i.e., across track only, for compliance with the band-interleaved-by-line (BIL) format.

As shown by the plots in Fig. 3, the S-RLP outperforms the 3-D-RLP, particularly when the prediction length is lower, which is a case of interest for a customized satellite onboard implementation. The performance of both S-RLP and 3-D-RLP cannot be improved significantly by increasing the number and length of predictors, because the overhead information increases as well. Fig. 4 shows that the increment in performance of S-RLP over 3-D-RLP is mostly concentrated in the near-infrared and short-wave infrared wavelengths and vanishes in the presence of absorption intervals.

TABLE I. BIT RATES (BIT/PEL/BAND ON DISK) FOR THE LOSSLESS COMPRESSION OF AVIRIS 1997 TEST HYPERSPECTRAL IMAGES

The advantages of context modeling before entropy coding are negligible for both the 3-D-RLP and S-RLP coding of hyperspectral images. The explanation is that context modeling helps entropy coding by exploiting the residual correlation and, particularly, the nonstationarity of residues [13]. Both S-RLP and 3-D-RLP attain a decorrelation that comes close to yielding residues that are stationary and uncorrelated, which leaves little for context modeling to exploit.

The comparison of results among the four algorithms by the authors and the three algorithms taken from the most recent and advanced literature is described hereafter. The original methods by the authors are based on the crisp and fuzzy adaptive spectral prediction and are referred to as S-RLP and S-FMP (spectral fuzzy-matching pursuits). The existing methods by the authors are based on the crisp and fuzzy adaptive 3-D prediction and are referred to as 3-D-RLP [6] and 3-D-FMP [9]. The other methods compared are the clustered DPCM (C-DPCM) [11], M-CALIC [8], and the up-to-date Spectral-oriented Least SQuare (SLSQ) encoder [5]. The five classification-based schemes (3-D-RLP, S-RLP, 3-D-FMP, S-FMP, and C-DPCM) use 16 classes and predictors of length 20. The bit rates refer to the whole hyperspectral images, i.e., to all their scenes.

Table I shows the lossless bit rates that are produced by the seven algorithms on the whole image test set. All scenes of each image have been compressed. The bit rates produced by M-CALIC are slightly lower, by a few hundredths of a bit, than those reported in [8], which were relative to a subpart of the fourth scene (256 lines only). The SLSQ and the M-CALIC have been developed for onboard applications and, hence, optimized for computational cost rather than for performance. Concerning the classified schemes, while 3-D-RLP, 3-D-FMP, and, particularly, C-DPCM are definitely unsuitable for onboard compression, the S-RLP may be easily adjusted to work in BIL format by making the shape of the block more elongated across track than along track, and by moving the training offline. Computing times of S-RLP are about 10% lower than those of the 3-D-RLP, for the same prediction length and number of predictors. The S-FMP performs slightly better than the S-RLP but, unfortunately, is about twice as computationally expensive. The amount of coding overhead is 0.027 bit/pel/band for the S-RLP and the 3-D-RLP and 0.012 bit/pel/band for the S-FMP and the 3-D-FMP. Such values are relative to a 614 × 512 scene. The fuzzy schemes have lower overhead because the labels of predictors are not transmitted.
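As a rough consistency check on the overhead figures above, the following back-of-the-envelope arithmetic may help. It rests on two assumptions that the letter does not state explicitly: that the overhead of the fuzzy schemes consists only of predictor coefficients, and that the crisp schemes add only block labels on top of it.

$$N = 614 \times 512 = 314\,368 \ \text{pixels per band}$$

$$\text{labels:}\quad (0.027 - 0.012) \times N \approx 4716 \ \text{bits per band}$$

$$\text{coefficients:}\quad 0.012 \times N \approx 3772 \ \text{bits per band} \;\Longrightarrow\; \frac{3772}{16 \times 20} \approx 11.8 \ \text{bits per coefficient}$$

Under these assumptions, the block labels account for slightly more than half of the crisp overhead, which is the part that the fuzzy schemes save.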

Fig. 5. Performance plots for the near-lossless compression of AVIRIS Cuprite Mine ’97 test hyperspectral images. (a) PSNR versus bit rate and (b) MAD versus bit rate.

C. Near-Lossless Compression

In addition, some near-lossless compression tests have been performed on the fourth scene of Cuprite. They follow the same trends as the lossless case for the S-RLP and are analogous to those of the 3-D-RLP, reported in [6], and of the M-CALIC, reported in [8]. The near-lossless bit-rate profiles are rigidly shifted downward from the lossless case (see Fig. 4) by amounts that are proportional to the logarithm of the quantization-induced distortion. This behavior does not occur for low bit rates, because of the quantization noise feedback effect: the prediction becomes poorer and poorer as it is obtained from the highly distorted reconstructed samples used by the predictor, which must be aligned to the decoder.

Rate-distortion (RD) plots are shown in Fig. 5(a) for S-RLP and 3-D-RLP operating with M = 16 predictors and S = 20 coefficients per predictor. The PSNR of the whole image is calculated from the average mse of the sequence of bands. Due to the sign bit, the full scale $g_{fs}$ in (3) was set equal to $2^{15} - 1 = 32\,767$ instead of 65 535, since small negative values, which are introduced by removal of dark current during calibration, are very rare and totally missing in some scenes. Hence, the PSNR attains a value of $10 \log_{10}(12 g_{fs}^{2}) \approx 101$ dB, due to the integer roundoff noise only, when reversibility is reached. The correction for roundoff noise has a twofold advantage. First, lossless points appear inside the plot and can be directly compared. Second, all the PSNR-bit rate plots are straight lines with a slope of ≈6 dB/bit for bit rates larger than, for example, 1 bit/pel, in agreement with the RD theory [2] (with a uniform threshold quantizer). For lower bit rates, the quantization noise feedback causes an exponential drift from the theoretical straight line.
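The ≈6 dB/bit slope follows from the standard high-resolution approximation for a uniform quantizer. The sketch below assumes that the quantization error is uniformly distributed over a step of width Δ and that doubling the step lowers the entropy-coded rate by about one bit per sample; D and R are notation introduced here for distortion and rate, not symbols from the letter.

$$D(\Delta) \approx \frac{\Delta^{2}}{12}, \qquad R(2\Delta) \approx R(\Delta) - 1 \ \text{bit}$$

$$\mathrm{PSNR}(\Delta) - \mathrm{PSNR}(2\Delta) = 10 \log_{10} \frac{(2\Delta)^{2}/12}{\Delta^{2}/12} = 20 \log_{10} 2 \approx 6.02 \ \text{dB}$$

Each bit of rate therefore buys about 6 dB of PSNR as long as the approximation holds, which is the regime above roughly 1 bit/pel in Fig. 5(a).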

Interestingly, the difference in bit rate between the S-RLP and the 3-D-RLP at a given PSNR is only two hundredths of a bit/pel near the lossless point but grows up to one tenth of a bit/pel at a rate equal to 1.5 bit/pel, which is typical of a high-quality lossy compression. Comparisons with an up-to-date algorithm [14], implementing the state-of-the-art JPEG 2000 multicomponent approach, reveal that the S-RLP outperforms the wavelet-based encoder by approximately 3.5 dB at 2 bit/pel. However, this difference reduces to 2.5 dB at 1 bit/pel and vanishes around 0.25 bit/pel, because of the quantization noise feedback effect, which is missing in the 3-D wavelet coder. This moderate loss of performance is the price that the embedded coders have to pay. The DPCM does not allow progressive reconstruction to be achieved but yields higher PSNR, at least for medium-high bit rates. A further advantage of DPCM is that it is near lossless, unlike JPEG 2000, which can be made lossless but not near lossless, unless an extremely cumbersome quantizer is employed, with a further loss in performance [15].

The near-lossless performance is shown in the MAD-bit rate plots of Fig. 5(b). Since the average standard deviation of the noise was found to be around 10, a virtually lossless compression (maximum compression-induced absolute error lower than the standard deviation of the noise [16]) is given by S-RLP at a bit rate around 1.6 bit/pel/band, thereby yielding a compression ratio CR = 10 with respect to the uncompressed data and CR ≈ 3 relative to the lossless compression. Fig. 5 shows that the increment in performance of S-RLP over 3-D-RLP is more relevant for such a bit rate than for higher rates.
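The two compression ratios quoted above can be checked with simple arithmetic. The sketch assumes that uncompressed samples are stored as 16-bit words, which the full-scale discussion suggests but the letter does not state outright; $R_{vl}$ and $R_{ll}$ are hypothetical symbols for the virtually lossless and lossless bit rates.

$$\mathrm{CR} = \frac{16}{R_{vl}} = \frac{16}{1.6} = 10, \qquad R_{ll} \approx 3 \cdot R_{vl} = 4.8 \ \text{bit/pel/band}$$

The implied lossless rate of roughly 4.8 bit/pel/band is only as accurate as the rounded ratio CR ≈ 3.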

IV. CONCLUSION

This letter has demonstrated that a purely spectral prediction calculated on classified hyperspectral pixel vectors is equivalent to a classified adaptive prediction carried out on the unclassified pixel spectra. Although prediction is purely spectral, and hence 1-D, the spatial correlation is removed by the training phase of predictors, which is aimed at finding some statistically homogeneous spatial classes matching the set of prototype spectral predictors. The advantage is twofold, because clustering of predictors is much less time consuming than clustering of the whole data cube and because the training phase, i.e., clustering of predictors for each wavelength, may be moved offline instead of being carried out on the same data that will be coded. Other data acquired by the same instrument are likely to exhibit the same spectral trends, varying with landscape. Thus, the prediction will be slightly less fitting, but the overhead of predictors calculated online will be saved. Preliminary results of offline training have shown that the average increment in bit rate is about three hundredths of a bit, when one scene of the same image is coded with predictors calculated and trained on another scene.

As a last remark, we recall that an algorithm that is suitable for satellite onboard compression, either lossless or near lossless, should have several favorable characteristics, such as low complexity, low power and storage requirements, capability of working in BIL format, and quasi-constant bit rate varying with scenes. However, it should be evaluated on raw, i.e., uncalibrated, data, as they are produced by the instrument, not on the calibrated data, which are used for applications. Therefore, a performance ranking of methods might be different on raw data. In particular, methods exploiting the sparse histograms that are typical of calibrated data, such as the up-to-date top-performing spectral predictor based on lookup tables [17], may perform worse on raw data than on calibrated data.

REFERENCES

[1] R. E. Roger and M. C. Cavenor, “Lossless compression of AVIRIS images,” IEEE Trans. Image Process., vol. 5, no. 5, pp. 713–719, May 1996.

[2] N. S. Jayant and P. Noll, Digital Coding of Waveforms: Principles and Applications to Speech and Video. Englewood Cliffs, NJ: Prentice-Hall, 1984.

[3] B. Aiazzi, L. Alparone, and S. Baronti, “Near-lossless image compression by relaxation-labelled prediction,” Signal Process., vol. 82, no. 11, pp. 1619–1631, Nov. 2002.

[4] B. Aiazzi, L. Alparone, and S. Baronti, “Fuzzy logic-based matching pursuits for lossless predictive coding of still images,” IEEE Trans. Fuzzy Syst., vol. 10, no. 4, pp. 473–483, Aug. 2002.

[5] F. Rizzo, B. Carpentieri, G. Motta, and J. A. Storer, “Low-complexity lossless compression of hyperspectral imagery via linear prediction,” IEEE Signal Process. Lett., vol. 12, no. 2, pp. 138–141, Feb. 2005.

[6] B. Aiazzi, L. Alparone, and S. Baronti, “Near-lossless compression of 3-D optical data,” IEEE Trans. Geosci. Remote Sens., vol. 39, no. 11, pp. 2547–2557, Nov. 2001.

[7] B. Aiazzi, P. Alba, L. Alparone, and S. Baronti, “Lossless compression of multi/hyper-spectral imagery based on a 3-D fuzzy prediction,” IEEE Trans. Geosci. Remote Sens., vol. 37, no. 5, pp. 2287–2294, Sep. 1999.

[8] E. Magli, G. Olmo, and E. Quacchio, “Optimized onboard lossless and near-lossless compression of hyperspectral data using CALIC,” IEEE Geosci. Remote Sens. Lett., vol. 1, no. 1, pp. 21–25, Jan. 2004.

[9] B. Aiazzi, L. Alparone, S. Baronti, and L. Santurri, “Near-lossless compression of multi/hyperspectral images based on a fuzzy-matching pursuits interband prediction,” in Proc. SPIE—Image and Signal Processing for Remote Sensing VII, S. B. Serpico, Ed., 2002, vol. 4541, pp. 252–263.

[10] X. Wu and N. Memon, “Context-based lossless interband compression—Extending CALIC,” IEEE Trans. Image Process., vol. 9, no. 6, pp. 994–1001, Jun. 2000.

[11] J. Mielikainen and P. Toivanen, “Clustered DPCM for the lossless compression of hyperspectral images,” IEEE Trans. Geosci. Remote Sens., vol. 41, no. 12, pp. 2943–2946, Dec. 2003.

[12] B. Aiazzi, L. Alparone, S. Baronti, A. Bertoli, C. Lastri, F. Lotti, E. Magli, G. Olmo, and B. Penna, “Advanced methods for onboard lossless compression of hyperspectral data,” in Proc. SPIE—Mathematics of Data/Image Coding, Compression, and Encryption VI, With Applications, M. S. Schmalz, Ed., Dec. 2003, vol. 5208.

[13] B. Aiazzi, L. Alparone, and S. Baronti, “Context modeling for near-lossless image coding,” IEEE Signal Process. Lett., vol. 9, no. 3, pp. 77–80, Mar. 2002.

[14] B. Penna, T. Tillo, E. Magli, and G. Olmo, “Progressive 3-D coding of hyperspectral images based on JPEG 2000,” IEEE Geosci. Remote Sens. Lett., vol. 3, no. 1, pp. 125–129, Jan. 2006.

[15] A. Alecu, A. Munteanu, J. Cornelis, S. Dewitte, and P. Schelkens, “Wavelet-based scalable L-infinity-oriented compression,” IEEE Trans. Image Process., vol. 15, no. 9, pp. 2499–2512, Sep. 2006.

[16] C. Lastri, B. Aiazzi, L. Alparone, and S. Baronti, “Virtually lossless compression of astrophysical images,” EURASIP J. Appl. Signal Process., vol. 2005, no. 15, pp. 2521–2535, 2005.

[17] J. Mielikainen, “Lossless compression of hyperspectral images using lookup tables,” IEEE Signal Process. Lett., vol. 13, no. 3, pp. 157–160, Mar. 2006.