

INTERNATIONAL JOURNAL OF CLIMATOLOGY, VOL. 16, 1379-1390 (1996)

FORECASTING THE SPATIAL VARIABILITY OF THE INDIAN MONSOON RAINFALL USING CANONICAL CORRELATION

K. D. PRASAD AND S. V. SINGH Indian Institute of Tropical Meteorology, Pune-411008, India

email: [email protected] email: svs-701 @ncrnwf.ernet.in

Received 18 July 1994 Accepted 7 March 1996

ABSTRACT

The application of a canonical correlation model to the long-range forecast of the spatial variability of the Indian monsoon (June-September) rainfall is demonstrated. The predictands used in the model are the summer monsoon rainfall of 29 contiguous meteorological subdivisions of India and the predictors are the 500 hPa ridge axis position over India for April, the Darwin surface pressure tendency (April minus January), the sea-surface temperature of the central and eastern equatorial Pacific for the five successive months preceding the monsoon (January to May) and the rainfall of the southernmost subdivision of India (Kerala) for April. The model is developed on 30 years (1939-1968) of data and tested on 16 independent years thereafter.

The model demonstrates positive skill for a large contiguous group of meteorological subdivisions of India using the first canonical mode (found significant). The root-mean-square error and the absolute error between the observed and the predicted rainfall for different meteorological subdivisions are of the order of 1 cm. A high skill score (≥ 0.3) is found particularly for the meteorological subdivisions lying in west-central India.

The performance of the model appears to be better than that of the multiple regression model developed earlier by Prasad and Singh. The combined model (containing the first and the second canonical modes) appears to perform even better than the single model. These results, therefore, seem to be important in view of the long-range forecast of the spatial variability of the Indian monsoon rainfall.

KEY WORDS: Indian monsoon rainfall; long-range forecast; canonical correlation technique; spatial variability; EOF analysis.

1. INTRODUCTION

Prediction is an important aspect of meteorological research and has been attempted using both dynamical and empirical approaches. The forecast of tropical rainfall on longer time-scales, in particular, appears to be difficult because of the complexity of the atmospheric system. In India, attempts have been made to forecast the monsoon rainfall for more than 100 years. A comprehensive review of this work has been presented by Hastenrath (1987). These studies show that the summer monsoon rainfall is related to antecedent local or global climatic variables, knowledge of which can be utilized in improving the forecast of the monsoon rainfall variability. Different statistical models have been used for this purpose, such as the multiple linear regression model, the stochastic dynamic model, and the power regression model (Kung and Sharif, 1980; Thapliyal, 1982; Parthasarathy and Pant, 1985; Bhalme et al., 1986; Hastenrath, 1987, 1988; Shukla and Mooley, 1987; Gowarikar et al., 1989; Prasad and Singh, 1992), which show some skill in forecasting the monsoon rainfall of India. These models consider the forecast of the monsoon rainfall of India as a whole or of a large region thereof, but little attention has been paid to forecasting the rainfall variability on the subdivisional scale. Recently, Prasad and Singh (1992) have shown that regional and global climate variables could be used in forecasting the monsoon rainfall on reduced spatial and temporal scales. The present study analyzes the monsoon rainfall of contiguous meteorological subdivisions of India using the statistical approach of canonical correlation analysis, with the purpose of forecasting it on longer time-scales.

CCC 0899-8418/96/121379-12 © 1996 by the Royal Meteorological Society


Canonical correlation analysis was first considered by Glahn (1968) in the analysis of geophysical data. It has been used subsequently by Davis (1977), Barnett (1981), and Nicholls (1987) in meteorological and geophysical contexts. The leading work in prediction using this technique is that by Barnett and Preisendorfer (1987), and Graham et al. (1987a,b). It is now being applied increasingly to meteorological analysis and prediction (Deque and Servian, 1989; Metz, 1989; Graham, 1990; Bretherton et al., 1992; Barnston and Ropelewski, 1992; Wallace et al., 1992). The technique is a generalization of multiple regression analysis that allows one field to be regressed on another field and depicts the major patterns of covariance between the fields analysed (Graham et al., 1987b). It calculates the linear combinations of a set of predictands that maximize the relationship, in a least-squares sense, to a similarly calculated linear combination of a set of predictors. In this method the cross-data set covariance is maximized with each successive mode. The method has the advantage of operating on a full field of information and objectively defines the highly related patterns of the predictands and predictors. It also allows the filtering of undesired random covariability between the predictor and the predictand fields.

The presentation that follows in this paper is divided into seven parts. Section 2 describes the data and section 3 outlines the development of the model. Sections 4 and 5 describe the design of the model and the canonical loading patterns of the predictands and predictors respectively. The independent verification of the model is described in section 6 and section 7 provides the conclusion.

2. DATA

The monthly rainfall data of 29 contiguous meteorological subdivisions of India (Figure 1) for the period 1939 to 1984 are obtained from Parthasarathy et al. (1987). The data for the 500 hPa ridge axis positions along 75°E over India for April from 1939 to 1975 are obtained from Banerjee et al. (1978) and they have been updated through 1984 by the analysis of the monthly wind fields. The Darwin monthly mean sea-surface pressure data are obtained from V.E. Kousky, Climate Analysis Centre, USA. The Darwin surface pressure tendency is defined as April minus January surface pressure at Darwin. This tendency is found to be better associated with the subsequent Indian monsoon rainfall than the Darwin surface pressure itself and hence it has been used as a predictor in the study. The homogenized monthly mean sea-surface temperatures for the central and eastern equatorial Pacific, January through to May, for the period 1939 to 1984, are taken from Wright (1989). In this, the temperature index has been defined as the mean sea-surface temperature over the region 6°-2°N, 170°-90°W; 2°N-6°S, 180°-90°W; 6°-10°S, 150°-110°W.

3. ANALYSIS TECHNIQUE

Forecasting through the canonical correlation technique is discussed briefly here and details can be found in the references cited in this section. Consider $R(x,t)$ to be the predictand and $Y(x',t)$ to be the predictor data sets, where $x = 1, \ldots, p$ and $x' = 1, \ldots, q$ index the variables in the predictand and the predictor sets, respectively, and $t = 1, \ldots, n_t$ indexes the observations in each set. These data sets are normalized (as discussed in section 4) and then they are orthogonalized individually using EOF analysis. The details of the EOF analysis procedure are discussed by Davis (1976), Barnett and Hasselmann (1979) and partially by Prasad and Sikka (1982). The EOFs of each of the predictand and the predictor fields are truncated by applying both subjective and objective criteria (as discussed in section 4). This process, therefore, eliminates the redundancy from the predictand and predictor fields and also reduces the number of predictors. The coefficients of the truncated EOFs of the predictand and the predictor fields are then subjected to canonical correlation analysis, which gives rise to new sets of canonical variates as linear combinations of their respective variables. A useful discussion of the method is provided by Glahn (1968), Davis (1977), Barnett and Preisendorfer (1987) and Nicholls (1987). The predictor ($Y$) and predictand ($R$) fields can be represented as linear combinations of their canonical variates. (These canonical variates are denoted $u_k$ and $v_j$ for the predictand and the predictor fields, respectively.)

Figure 1. The monsoon rainfall of 29 contiguous meteorological subdivisions of India considered as predictands. Key to the subdivision numbers: 1 N. Assam; 2 S. Assam; 3 S.H.W. Bengal; 4 G.W. Bengal; 5 Orissa; 6 Bihar Plateau; 7 Bihar Plain; 8 U.P. East; 9 U.P. West; 10 Haryana; 11 Punjab; 12 Rajasthan West; 13 Rajasthan East; 14 M.P. West; 15 M.P. East; 16 Gujarat; 17 Saurashtra and Kutch; 18 Konkan; 19 M. Maharashtra; 20 Marathwada; 21 Vidarbha; 22 C. Andhra Pradesh; 23 Telangana; 24 Rayalseema; 25 Tamil Nadu; 26 Coastal Karnataka; 27 Karnataka North; 28 Karnataka South; 29 Kerala

Written out, this representation is

$$Y(x', t) = \sum_{j} v_j(t)\, g_j(x'), \qquad R(x, t) = \sum_{k} u_k(t)\, h_k(x), \tag{1}$$

where $g_j$ and $h_k$ are the $j$th and $k$th canonical loading patterns of the predictors and the predictands, respectively. These canonical loading patterns, which are vectors, are written as

$$g_j(x') = \langle Y(x', t)\, v_j(t) \rangle_t$$

and

$$h_k(x) = \langle R(x, t)\, u_k(t) \rangle_t, \tag{2}$$

where $\langle \cdot \rangle_t$ denotes an average over time. The components of these vectors show the correlations at a specific location ($x$ or $x'$) of $Y$ and $R$ with their respective canonical variates. The canonical loading patterns as they are defined are not unit vectors, nor are they mutually orthogonal. It is convenient to deal with the normalized version of the $g_j$ (or $h_k$) patterns. If the normalized canonical patterns are denoted by $g'_j$ then

$$g'_j(x') = g_j(x') \Big/ \Big[ \sum_{x'} g_j^2(x') \Big]^{1/2} \tag{3}$$

so that

$$\sum_{x'} \left[ g'_j(x') \right]^2 = 1.$$

A similar expression for $h'_k$ can be obtained by substituting $h_k$ in place of $g_j$ in equation (3). The normalized patterns $g'_j$ represent a set of data fields 'stacked' in time and as such the relative predictive importance of each of the data fields can be easily determined.

The least-squares estimate of $R$ by $Y$ is

$$\hat{R}(x, t) = \sum_{j=1}^{q''} m_j\, v_j(t)\, h_j(x),$$

where $x = 1, 2, 3, \ldots, p$; $q'' \le q < p$; and $t = 1, 2, 3, \ldots, n_t$.

The variable $m_j$ is the $j$th canonical correlation and $q''$ is an appropriate number of canonical modes to be retained in the prediction equation. The truncation of the canonical modes and the different measures used for the evaluation of the forecast skill are discussed in sections 4 and 6.
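The procedure outlined in this section can be illustrated with a short Python sketch. It is not the authors' code: the function names `fit_cca` and `predict`, the use of NumPy's singular value decomposition for both the EOF and the canonical steps, and the synthetic data are all assumptions made for the example. The inputs are assumed to be standardized already, with time along the rows.

```python
import numpy as np

def fit_cca(R, Y, n_eof_r, n_eof_y):
    """EOF-prefiltered CCA. R is (nt, p), Y is (nt, q); columns standardized."""
    nt = R.shape[0]
    # EOF step: SVD of each field, keep the leading modes and scale the
    # EOF coefficients (principal components) to unit variance.
    Ur, _, _ = np.linalg.svd(R, full_matrices=False)
    Uy, sy, Vty = np.linalg.svd(Y, full_matrices=False)
    A = Ur[:, :n_eof_r] * np.sqrt(nt)   # predictand EOF coefficients
    B = Uy[:, :n_eof_y] * np.sqrt(nt)   # predictor EOF coefficients
    # Canonical step: for whitened inputs the singular values of the
    # cross-covariance matrix are the canonical correlations m_j.
    L, m, Mt = np.linalg.svd(A.T @ B / nt, full_matrices=False)
    u = A @ L                           # predictand canonical variates u_k(t)
    h = R.T @ u / nt                    # loading patterns h_k(x) = <R u_k>_t
    # Linear map from standardized predictors to the variates v_j(t).
    to_eof = (Vty[:n_eof_y].T / sy[:n_eof_y]) * np.sqrt(nt)
    proj = to_eof @ Mt.T
    g = Y.T @ (Y @ proj) / nt           # loading patterns g_j(x') = <Y v_j>_t
    return {"m": m, "h": h, "g": g, "proj": proj}

def predict(model, Y_new, n_modes):
    """Least-squares estimate: sum over retained modes of m_j v_j(t) h_j(x)."""
    v = Y_new @ model["proj"][:, :n_modes]
    return (v * model["m"][:n_modes]) @ model["h"][:, :n_modes].T

# Synthetic stand-in data (hypothetical): 30 years, 29 predictands, 8 predictors.
rng = np.random.default_rng(0)
nt, p, q = 30, 29, 8
signal = rng.standard_normal(nt)
R = np.outer(signal, rng.standard_normal(p)) + rng.standard_normal((nt, p))
Y = np.outer(signal, rng.standard_normal(q)) + rng.standard_normal((nt, q))
R = (R - R.mean(axis=0)) / R.std(axis=0)
Y = (Y - Y.mean(axis=0)) / Y.std(axis=0)

model = fit_cca(R, Y, n_eof_r=6, n_eof_y=4)
R_hat = predict(model, Y, n_modes=1)    # single-mode estimate
```

Passing `n_modes=2` to `predict` would correspond to a model that combines the first two canonical modes.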

4. MODEL DESIGN

The forecast of the monsoon rainfall for each of the contiguous meteorological subdivisions of India (the normals and standard deviations for which are presented in Figure 2) was attempted by Prasad and Singh (1992) using the multiple regression technique. They found high skill for the meteorological subdivisions of west-central India. In the present study we further explore the predictability of the monsoon rainfall of these meteorological subdivisions by applying the canonical correlation technique, which may provide a better result than that obtained through the multiple linear regression technique.

The predictands considered in the study are the monsoon (June-September) rainfall of each of the 29 contiguous meteorological subdivisions of India (Figure 1) and the predictors are eight different regional and global parameters: the 500 hPa ridge axis position along 75°E over India during April; the Darwin surface pressure tendency (April minus January surface pressure); the sea-surface temperature of the central and eastern equatorial Pacific for the five individual months preceding the monsoon (i.e. January-May); and the rainfall of Kerala for April. The first two of these parameters have been used most frequently in the prediction of the Indian monsoon rainfall and their importance has been established by many workers (Banerjee et al., 1978; Shukla and Paolino, 1983; Mooley et al., 1986; Singh et al., 1986; Shukla and Mooley, 1987; Hastenrath, 1988; Prasad and Singh, 1992). The delayed northward propagation of the 500 hPa ridge axis position over India represents a delay in the seasonal cycle of the tropical circulation, and the monsoon rainfall in such a case is likely to be deficient (and vice versa). The Darwin surface pressure tendency represents the phases of the Southern Oscillation, and a negative (positive) phase of the Southern Oscillation is likely to produce below (above) normal monsoon rainfall over the Indian subcontinent.

The warming in the sea-surface temperature over the central and eastern equatorial Pacific, associated with an El Niño event, may appear in the pre-monsoon months. In such cases the area is converted into a zone of ascending motion, and a branch of compensating descending motion falls over the monsoon region, causing deficient rainfall over the Indian subcontinent. The rainfall during the pre-monsoon month (April) over Kerala has also been considered as a predictor. The pre-monsoon convective activity over Kerala, where the monsoon rainfall sets in first, shows an association with the subsequent monsoon rainfall of the meteorological subdivisions of India. The association is found to be better for the April rainfall than for the May rainfall of Kerala. This region seems to be one of concentrated surface heating during April. It appears that the latent heat provided by the pre-monsoon convective activity over Kerala contributes to some extent to the building up of the monsoon circulation over the Indian subcontinent.

Figure 2. The normal and standard deviation of the summer (June-September average) monsoon rainfall of India, per subdivision. The subdivision numbers listed are as in Figure 1

Forty-six years of recent data have been used, the first 30 years (1939-1968) in the development of the model and the remaining 16 years (1969-1984) in the verification of the model. In order to provide equal weights to all the variables and to provide a correlation-based canonical model, the predictor and predictand data sets are normalized by subtracting the respective mean and dividing by the respective standard deviation, based on the data from 1939-1968. Further, in order to eliminate the redundancy from the data and to reduce the degree of overfitting in the model, the predictand and predictor data sets are individually orthogonalized by subjecting them to EOF analysis. The pre-orthogonalization of the data sets also safeguards against any degeneracies likely to be introduced during the inversion of the matrix in the model-building process.
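The normalization and pre-orthogonalization steps described above might be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the helper names `standardize` and `eof_analysis` are hypothetical, the EOFs are obtained from a singular value decomposition, and the predictor values are random placeholders rather than the observed data.

```python
import numpy as np

def standardize(train, test):
    """Normalize both samples with the mean and standard deviation of the
    development (training) period only."""
    mean = train.mean(axis=0)
    std = train.std(axis=0, ddof=1)
    return (train - mean) / std, (test - mean) / std

def eof_analysis(z):
    """Return EOF patterns (rows of vt), EOF coefficients and the
    percentage of variance explained by each mode."""
    _, s, vt = np.linalg.svd(z, full_matrices=False)
    variance_pct = 100.0 * s**2 / np.sum(s**2)
    coefficients = z @ vt.T          # time series of each EOF mode
    return vt, coefficients, variance_pct

years = np.arange(1939, 1985)                     # 46 years, 1939-1984
rng = np.random.default_rng(1)
predictors = rng.normal(size=(years.size, 8))     # placeholder values only

is_train = years <= 1968                          # 30 development years
train_z, test_z = standardize(predictors[is_train], predictors[~is_train])
eofs, coeffs, variance_pct = eof_analysis(train_z)
```

Because the verification years are scaled with the development-period statistics, no information from 1969-1984 enters the model-building step.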

Because the model performance depends largely on the input variables, the selection procedure of the EOFs from the predictand and the predictor fields is quite important. We tried several objective procedures, such as rule N (Overland and Preisendorfer, 1982), the scree test (Cattell, 1966), and the eigenvalue ≥ 1.0 criterion, for the selection of EOFs from the predictand and the predictor fields. (Table III shows the number of EOFs retained from the predictand and the predictor fields in each of these cases.) Rule N appears to be quite stringent, selecting comparatively fewer EOFs, namely four and two from the predictand and the predictor fields respectively. The eigenvalue ≥ 1.0 criterion selected a reasonable number of EOFs (i.e. seven) from the predictand field but relatively fewer EOFs (i.e. three) from the predictor field. The scree test on the other hand selected six EOFs from each of the predictand and the predictor fields, and consequently a large proportion of the variance, particularly of the predictor field (i.e. about 90 per cent; see Table II) is retained in this case. From the above results it appears that the objective selection procedures should be applied with some caution, and that the subjective criterion for the selection of EOFs should also be given due consideration. We therefore applied the subjective criterion to the selection of a suitable number of EOFs from each of the predictand and the predictor fields. In the subjective case we retained the first six EOFs from the predictand field and the first four EOFs from the predictor field, explaining 71 and 87 per cent of the total variance respectively. The higher order EOFs, explaining 5 per cent or less of the total variance (Tables I and II), therefore have been discarded from both the predictand and the predictor fields, rendering the new set of data more or less noise-free.
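The variance retained by the subjective truncation can be checked directly from the per-mode percentages listed in Tables I and II. The short Python sketch below does this; the helper name `retained_variance` is an assumption introduced for the example.

```python
# Percentage of variance explained per EOF mode, as listed in Tables I and II.
predictand_variance = [27.9, 13.8, 10.1, 7.3, 6.3, 5.4,
                       4.9, 3.7, 3.5, 2.6, 2.5, 2.3]
predictor_variance = [47.5, 18.2, 12.5, 8.5, 6.9, 3.9, 1.6, 0.8]

def retained_variance(variance_pct, n_modes):
    """Cumulative percentage of variance kept by the first n_modes EOFs."""
    return sum(variance_pct[:n_modes])

# Subjective truncation: six predictand EOFs and four predictor EOFs.
predictand_kept = retained_variance(predictand_variance, 6)
predictor_kept = retained_variance(predictor_variance, 4)
print(round(predictand_kept, 1), round(predictor_kept, 1))
```

The two sums agree with the 71 and 87 per cent quoted in the text for the subjective case.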

Table I. Variance and cumulative variance explained (in per cent) by the EOF modes (the first 12) of the predictands

EOF mode                1     2     3     4     5     6     7     8     9    10    11    12
Variance             27.9  13.8  10.1   7.3   6.3   5.4   4.9   3.7   3.5   2.6   2.5   2.3
Cumulative variance  27.9  41.7  51.8  59.1  65.4  70.8  75.6  79.3  82.9  85.5  88.0  90.3

Table II. Variance and cumulative variance explained (in per cent) by the EOF modes of the predictors

EOF mode                1     2     3     4     5     6     7      8
Variance             47.5  18.2  12.5   8.5   6.9   3.9   1.6    0.8
Cumulative variance  47.5  65.7  78.2  86.7  93.6  97.5  99.2  100.0

Table III. The number of EOF modes retained from (a) the predictand and (b) predictor fields, the first two canonical correlations (C.C.) and the number of subdivisions having the Heidke skill score (S.S.) ≥ 0.3 in (a) single and (b) combined models for each of the analysis criteria applied

Truncation criteria   EOF modes retained   First C.C.   Second C.C.   Subdivisions with S.S. ≥ 0.3
                      (a)      (b)                                    (a)      (b)
Subjective             6        4          0.92         0.50           9       11
Scree test             6        6          0.92         0.68           3        4
Rule N                 4        2          0.84         0.18           6        7
Eigenvalue ≥ 1.0       7        3          0.93         0.39           9       10

The coefficients of the EOFs of the predictand and the predictor fields thus obtained have been subjected to canonical correlation analysis, and the first two canonical correlations are detailed in Table III. The first canonical correlation is found to be significant, by applying the Bartlett classical confirmatory test, in each case of the analyses. As the truncation of the model based on the significance of the canonical correlation seems to be quite rigid, the performance of the model is examined both for canonical mode one alone and for canonical modes one and two combined. The number of subdivisions with a Heidke skill score (the details of which are discussed in section 6) ≥ 0.3 in the case of the single model is presented in Table III. In each of the cases of the subjective and the eigenvalue ≥ 1.0 criteria, nine meteorological subdivisions are found to have a skill score ≥ 0.3. The results show further improvement in the case of the combined model (i.e. with the canonical modes one and two considered together) and the number of subdivisions with a skill score ≥ 0.3 rises to 11 in the case of the subjective criterion. It appears that the number of EOFs selected from each of the predictand and the predictor fields has been quite appropriate in the case of the subjective criterion, retaining a proportionate amount of the variance from both the fields. The results in the case of the subjective criterion appear to be the best of all the cases considered, and further discussion of the results is restricted to this case only.
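The Bartlett test mentioned above is commonly computed as a chi-square statistic from the canonical correlations that remain after removing the leading modes. The sketch below uses that textbook form; whether the authors used exactly this variant is an assumption, as is the function name `bartlett_chi_square`. The first two correlations are taken from Table III (subjective case), while the last two are hypothetical placeholders, since the paper does not list them.

```python
import math

def bartlett_chi_square(canon_corrs, n_obs, p, q):
    """Bartlett's sequential test. For each k, test whether the canonical
    correlations from mode k+1 onwards are jointly zero.
    Returns a list of (chi_square, degrees_of_freedom) pairs."""
    factor = n_obs - 1 - (p + q + 1) / 2.0
    results = []
    for k in range(len(canon_corrs)):
        log_lambda = sum(math.log(1.0 - m * m) for m in canon_corrs[k:])
        chi_square = -factor * log_lambda
        ndf = (p - k) * (q - k)
        results.append((chi_square, ndf))
    return results

# 0.92 and 0.50 are from Table III; 0.25 and 0.20 are hypothetical.
tests = bartlett_chi_square([0.92, 0.50, 0.25, 0.20], n_obs=30, p=6, q=4)

# Standard 5 per cent critical values of chi-square for 24 and 15 degrees of freedom.
critical_5pct = {24: 36.415, 15: 24.996}
for chi_square, ndf in tests[:2]:
    print(ndf, round(chi_square, 1), chi_square > critical_5pct[ndf])
```

With six predictand and four predictor EOFs the degrees of freedom for the first two modes are 24 and 15, and under these assumed inputs only the first mode exceeds its critical value.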

5. CANONICAL LOADING PATTERNS

Similar to EOF patterns, in the canonical correlation analysis we obtain the canonical loading patterns of the predictands and the predictors (see section 3, equation (1)). When they are normalized, the relative importance of each of the data fields can be determined easily. The first two normalized canonical loading patterns of the predictors are presented in Table IV. Similarly, the first two patterns of the predictands are shown in map form in Figure 3. The hindcast skills for the first two canonical modes are also presented in Table V. If $m_j$ represents the $j$th canonical correlation, the hindcast skill for the $j$th canonical mode is defined as $\left( m_j^2 / \sum_{i=1}^{N} m_i^2 \right) \times 100$, where $N$ is the total number of canonical modes obtained in the analysis. The fraction of the total variance explained by the $j$th canonical mode is thus represented by the $j$th hindcast skill. The first canonical loading pattern (Figure 3) explains 70 per cent of the hindcast skill, which is found to be associated mainly with the 500 hPa ridge axis position, followed by the Darwin surface pressure tendency, the sea-surface temperature of the central and eastern equatorial Pacific (May), and the rainfall of Kerala for April. The pattern shows that rainfall variations over the whole Indian subcontinent, except the southern part of north-east India, are in phase, with the high loadings lying over west-central India. It also appears from this pattern that rainfall over north-east India is less predictable (owing to the lower variability of rainfall over the region) and that it is more predictable over west-central India, which is the region of high variability of rainfall.

Table IV. The first two normalized canonical loading patterns of the predictors

Pattern   Pacific sea-surface temperature               Rainfall of Kerala   Darwin pressure   500 hPa ridge
          January  February  March   April    May       for April            tendency          over India
1          0.03     0.06     0.08    0.20     0.39       0.14                 0.51             -0.72
2          0.29     0.30     0.13   -0.18    -0.30      -0.75                -0.03             -0.35

Figure 3. The first two normalized canonical loading patterns of the predictands: (a) the first and (b) the second pattern

Table V. Canonical eigenvalues ($m_j^2$), canonical correlations ($m_j$), hindcast skill and chi-square with degrees of freedom (NDF) for the first two canonical modes

Canonical mode   Canonical eigenvalue ($m_j^2$)   Canonical correlation ($m_j$)   Hindcast skill   Chi-square   NDF
1                0.85                             0.92                            70.2             54.3         24
2                0.25                             0.50                            20.4              9.4         15
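The hindcast skill defined in this section is each squared canonical correlation expressed as a percentage of the sum over all modes. A minimal Python sketch follows; the function name `hindcast_skill` is an assumption, the first two eigenvalues are those of Table V, and the last two are hypothetical values chosen only so that the example has four modes (the paper does not report them).

```python
def hindcast_skill(canonical_eigenvalues):
    """Hindcast skill (per cent) of each mode: m_j^2 / sum_i m_i^2 x 100."""
    total = sum(canonical_eigenvalues)
    return [100.0 * eig / total for eig in canonical_eigenvalues]

# 0.85 and 0.25 are the canonical eigenvalues in Table V;
# 0.07 and 0.04 are hypothetical stand-ins for the unreported modes.
skills = hindcast_skill([0.85, 0.25, 0.07, 0.04])
print([round(s, 1) for s in skills])
```

Under these assumed trailing eigenvalues the first two skills come out close to the 70.2 and 20.4 per cent listed in Table V.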


The second canonical loading pattern explains 20 per cent of the hindcast skill and the main contribution to the variance of this mode is made by the rainfall of Kerala, followed by the 500 hPa ridge axis position and the sea-surface temperatures of the central and eastern equatorial Pacific (January, February and May). The pattern shows rainfall variations over north-west India in opposite phase to the rest of the country. The high loadings of the pattern are located generally over the northern part of the country (north of 20°N latitude), and hence in the second canonical loading pattern the rainfall over the northern part appears to be more predictable than that over the southern part of the country. In the following section we will see that the first two canonical modes combined provide a positive skill score for a large contiguous group of meteorological subdivisions of India, which was not likely to be arrived at by the multiple linear regression technique (Prasad and Singh, 1992).

6. INDEPENDENT VERIFICATION OF THE MODEL

The canonical correlation model developed for forecasting the monsoon rainfall of the meteorological subdivisions has been tested on 16 years (1969-1984) of independent data. In addition to the Heidke skill score (Livezey, 1990), other statistics of forecast performance, such as the correlation coefficient (CC), root-mean-square error (RMSE), mean absolute error (MAE) and bias, as used by Nicholls (1984) and Hastenrath (1988), have been evaluated and are presented in Table VI. The Heidke skill score is defined as (C - E)/(T - E), where C is the number of correct forecasts, E is the number of correct forecasts expected by chance, and T is the total number of forecasts. The skill score varies from zero (when C = E) to 1 (when C = T). This has been computed from a 3 × 3 contingency table, where the observed as well as the corresponding predicted rainfall are classified in three equiprobable categories. The limits of these categories have been determined from the dependent sample in order to avoid bias (Livezey, 1990).

Table VI. Various measures of forecast performance obtained for the meteorological subdivisions based on 16 independent cases. Subdivision numbers are as in Figure 1

No.  Subdivision             Correlation   Root-mean-     Bias   Absolute
                             coefficient   square error          error
1    N. Assam                -0.13         1.2             0.0   1.0
2    S. Assam                 0.04         1.4            -0.9   1.2
3    S.H.W. Bengal            0.29         0.9            -0.2   0.8
4    G.W. Bengal             -0.36         1.5             0.2   1.2
5    Orissa                   0.04         1.4            -0.9   1.1
6    Bihar Plateau            0.51*        1.0             0.1   0.9
7    Bihar Plain              0.34         1.4             0.4   1.1
8    U.P. East                0.49*        1.3             0.6   1.0
9    U.P. West                0.45         1.1             0.2   0.9
10   Haryana                  0.63*        1.0             0.5   0.9
11   Punjab                   0.70*        1.1             0.5   1.0
12   Rajasthan West           0.56*        1.1             0.4   1.0
13   Rajasthan East           0.77*        0.7             0.2   0.5
14   M.P. West                0.62*        0.8             0.2   0.7
15   M.P. East                0.43         0.9            -0.3   0.7
16   Gujarat                  0.63*        0.8             0.0   0.6
17   Saurashtra and Kutch     0.26         1.0             0.1   0.8
18   Konkan                   0.50         0.8             0.3   0.6
19   Maharashtra              0.54         1.3             0.3   1.1
20   Marathwada               0.47         1.3            -0.1   1.0
21   Vidarbha                 0.57*        1.0            -0.1   0.8
22   Andhra Pradesh           0.44         1.0            -0.0   0.8
23   Telangana                0.45         1.4             0.2   1.1
24   Rayalseema               0.23         1.1             0.2   0.9
25   Tamil Nadu               0.39         1.0             0.2   0.8
26   C. Karnataka             0.58*        1.2             0.6   0.9
27   Karnataka North          0.41         1.1             0.1   0.9
28   Karnataka South          0.30         1.0             0.1   0.8
29   Kerala                   0.46         0.9            -0.0   0.7

*Significant at the 5 per cent level (CC ≥ 0.48).
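The Heidke skill score as defined in this section can be sketched in Python as follows. This is an illustration rather than the authors' procedure: the function names are hypothetical, the category limits are taken midway between neighbouring sorted values of the dependent sample, and the chance expectation E is computed from the marginal totals of the contingency table, which is one common convention.

```python
def tercile_limits(train_values):
    """Limits of three equiprobable classes from the dependent sample."""
    ordered = sorted(train_values)
    n = len(ordered)
    lower = 0.5 * (ordered[n // 3 - 1] + ordered[n // 3])
    upper = 0.5 * (ordered[2 * n // 3 - 1] + ordered[2 * n // 3])
    return lower, upper

def categorize(value, lower, upper):
    """0 = below normal, 1 = near normal, 2 = above normal."""
    if value < lower:
        return 0
    if value > upper:
        return 2
    return 1

def heidke_skill_score(observed_cat, forecast_cat, n_cat=3):
    """(C - E) / (T - E) from an n_cat x n_cat contingency table."""
    table = [[0] * n_cat for _ in range(n_cat)]
    for obs, fc in zip(observed_cat, forecast_cat):
        table[obs][fc] += 1
    total = len(observed_cat)
    correct = sum(table[i][i] for i in range(n_cat))
    # Expected number correct by chance, from row and column totals.
    expected = sum(
        sum(table[i]) * sum(row[i] for row in table) for i in range(n_cat)
    ) / total
    return (correct - expected) / (total - expected)

# Hypothetical categories for six forecasts, three of them correct.
observed = [0, 1, 2, 0, 1, 2]
forecast = [0, 1, 2, 1, 2, 0]
score = heidke_skill_score(observed, forecast)
limits = tercile_limits(list(range(1, 31)))   # 30 dummy training values
```

In practice each observed and predicted rainfall value of the verification years would first be passed through `categorize` using the limits obtained from the 30 development years.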

The Heidke skill scores obtained are presented in map form in Figure 4(a) for the single model and in Figure 4(b) for the combined model (containing canonical modes one and two combined). A large contiguous group of meteorological subdivisions (Figure 4(a)) has a positive skill score (≥ 0.1) in the single model (containing the first canonical mode). High skill scores (≥ 0.3) are noted for subdivisions lying in west-central India (i.e. U.P. East, Haryana, Rajasthan East, Madhya Pradesh East, Gujarat, Madhya Maharashtra, Marathwada, Vidarbha and Karnataka South). In the combined model some improvement in the forecast skill is noticed, firstly in the areal extent of the positive skill score (≥ 0.1) and secondly in the magnitude of the skill score over central India. The subdivisions for which the improvement has been noted are Punjab, Rajasthan West and Konkan, with a fall in the skill score for Madhya Maharashtra (due to the inclusion of some random noise in the model because of the second canonical mode).

Figure 4. The Heidke skill score obtained in forecasting the monsoon rainfall of the meteorological subdivisions of India for (a) the model containing the first canonical mode and (b) the model containing the first and the second canonical modes.


Figure 5. The Heidke skill score obtained in forecasting the monsoon rainfall of the meteorological subdivisions of India from the multiple linear regression model (after Prasad and Singh, 1992)

The other measures of the forecast skill computed for the single and the combined models are found to be more or less the same, and hence these measures are presented only for the single canonical model. The correlation coefficients between the observed and predicted rainfall (Table VI) are found to be significant for a contiguous area lying in west-central India. The root-mean-square errors for different meteorological subdivisions are of the order of 1 cm or less, which seems to be much smaller than the standard deviations of their mean rainfall values, and the model may be considered to be a stable one. The forecast seems to be positively biased, with values lying in the range 0.0-0.6. The absolute errors for different meteorological subdivisions are found to be within the range 0.5-1.2.
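For reference, the four verification measures reported in Table VI can be computed as in the following Python sketch. The function name `forecast_measures` and the sample numbers are hypothetical, and the bias is taken here as the mean of predicted minus observed values, which is an assumed sign convention.

```python
import math

def forecast_measures(observed, predicted):
    """Correlation coefficient, RMSE, bias and mean absolute error."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    bias = sum(errors) / n                        # mean error
    mae = sum(abs(e) for e in errors) / n         # mean absolute error
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    cc = cov / math.sqrt(var_o * var_p)
    return {"cc": cc, "rmse": rmse, "bias": bias, "mae": mae}

# Hypothetical observed and predicted values for four cases.
measures = forecast_measures([1.0, 2.0, 3.0, 4.0], [2.0, 2.0, 4.0, 4.0])
```

With the 16 independent years of one subdivision as input, the same function would give one row of a table such as Table VI.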

It can be noted that the multiple regression model developed by Prasad and Singh (1992) for forecasting the monsoon rainfall of the meteorological subdivisions of India also shows a high positive skill score for west-central India (Figure 5). However, the forecast skill for west-central India is found to be higher, and the area encompassed by the positive skill score (≥ 0.1) is found to be larger, in the present case of the canonical correlation model. It is also worth noting that the first two predictors selected in the case of the single canonical correlation model and the multiple regression model are the same. These two predictors are the 500 hPa ridge axis position over India during April and the Darwin surface pressure tendency from winter to spring (April minus January), and they can be regarded as the most promising predictors of the Indian monsoon rainfall.

7. CONCLUSIONS

In this paper we have examined the usefulness of a canonical correlation model in the long-range forecast of the spatial variability of the monsoon rainfall of India. The model uses the summer monsoon rainfall of 29 contiguous meteorological subdivisions as predictands. The predictors considered are the monthly sea-surface temperature of the central and eastern equatorial Pacific for the five successive months preceding the monsoon (January to May), the Darwin surface pressure tendency from spring to winter, and the 500 hPa ridge axis position over India during April. The April rainfall of the southernmost subdivision of India has also been considered as a predictor in the model. The model is developed on 30 years of data and tested on 16 independent years thereafter.

The model has been examined both as a single model (containing the first significant canonical mode) and as a combined model (containing the first and the second canonical modes together).

The results of the combined model show an overall improvement in forecast skill over that of the single model, without the inclusion of much noise in the model.
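The distinction between a single-mode and a two-mode forecast can be illustrated with a generic canonical correlation sketch. This is not the authors' implementation: it uses synthetic data, small dimensions (4 predictors, 3 predictands) chosen so that the covariance matrices are invertible, and no EOF pre-filtering; all function and variable names are hypothetical.

```python
import numpy as np

def inv_sqrt(S):
    """Inverse symmetric square root of a positive-definite matrix."""
    w, V = np.linalg.eigh(S)
    return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

def cca_fit(X, Y):
    """Fit canonical correlation analysis by whitening and SVD."""
    n = X.shape[0]
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - x_mean, Y - y_mean
    Sxx = Xc.T @ Xc / (n - 1)
    Syy = Yc.T @ Yc / (n - 1)
    Sxy = Xc.T @ Yc / (n - 1)
    Wx, Wy = inv_sqrt(Sxx), inv_sqrt(Syy)
    # Singular values of the whitened cross-covariance are the
    # canonical correlations, returned in descending order.
    U, rho, Vt = np.linalg.svd(Wx @ Sxy @ Wy, full_matrices=False)
    A = Wx @ U              # predictor weights (canonical variates u = Xc @ A)
    B = Wy @ Vt.T           # predictand weights (canonical variates v = Yc @ B)
    patterns = Syy @ B      # predictand patterns: cov(Y, v_k) in column k
    return {"x_mean": x_mean, "y_mean": y_mean,
            "A": A, "rho": rho, "patterns": patterns}

def cca_predict(model, X_new, n_modes):
    """Predict Y from X_new using only the first n_modes canonical modes."""
    u = (X_new - model["x_mean"]) @ model["A"][:, :n_modes]
    v_hat = u * model["rho"][:n_modes]          # regress v_k on u_k
    return model["y_mean"] + v_hat @ model["patterns"][:, :n_modes].T

# Synthetic example: 30 "years", 4 predictors, 3 predictands.
rng = np.random.default_rng(0)
X = rng.standard_normal((30, 4))
Y = X @ rng.standard_normal((4, 3)) + 0.5 * rng.standard_normal((30, 3))

model = cca_fit(X, Y)
single = cca_predict(model, X, n_modes=1)     # first mode only
combined = cca_predict(model, X, n_modes=2)   # first and second modes
```

Each additional mode adds one more predictor-predictand pattern pair to the forecast; when all modes are retained, this construction reproduces the ordinary least-squares fit, so truncation to the leading modes is what filters out the weakly correlated part of the relationship.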

A large contiguous group of meteorological subdivisions of India is found to have a positive skill score (> 0.1). The skill score is found to be particularly high (≥ 0.3) for the meteorological subdivisions lying in west-central India. The correlation coefficients between the observed and the predicted rainfall are generally found to be significant. The root-mean-square errors and the absolute errors are found to be small (about 1 cm). The performance of the single model is found to be even better than that of the multiple regression model developed earlier by Prasad and Singh (1992), in which the first two predictors selected (i.e. the 500 hPa ridge axis position and the Darwin pressure tendency) are identical to those of the present model. Thus, the model appears to be of importance for the long-range forecast of the spatial variability of the monsoon rainfall of India. A further refinement of the model may be possible through the inclusion of additional predictors, which will be explored in the future.

ACKNOWLEDGEMENTS

We would like to thank Professor R. N. Keshavamurty, Director, and Dr S. S. Singh, Head of the Forecasting Research Division, Indian Institute of Tropical Meteorology, Pune, for encouragement and providing facilities. Thanks are also due to H. N. Bhalme and R. K. Verma for going through the initial version of the paper. We are also grateful to the anonymous referees and the editor Brian D. Giles for many useful suggestions and for improvement of the text.

REFERENCES

Bartlett, M. S. 1966. ‘The statistical significance of canonical correlations’, Biometrika, 32, 29-38.

Banerjee, A. K., Sen, P. N. and Raman, C. R. V. 1978. ‘On foreshadowing south west monsoon rainfall over India with midtropospheric circulation anomaly of April’, Ind. J. Meteorol. Hydrol. Geophys., 29, 425-431.

Barnett, T. P. 1981. ‘Statistical prediction of North American air temperature from Pacific predictors’, Mon. Wea. Rev., 109, 1021-1041.

Barnett, T. P. and Hasselmann, K. 1979. ‘Techniques of linear prediction with application to oceanic and atmospheric fields in the tropical Pacific’, Rev. Geophys. Space Phys., 17, 949-968.

Barnett, T. P. and Preisendorfer, R. 1987. ‘Origins and levels of monthly and seasonal forecast skill for United States surface air temperature determined by canonical correlation analysis’, Mon. Wea. Rev., 115, 1825-1850.

Barnston, A. G. and Ropelewski, C. F. 1992. ‘Prediction of ENSO episodes using canonical correlation analysis’, J. Climate, 5, 1316-1345.

Bhalme, H. N., Jadhav, S. K., Mooley, D. A. and Ramana Murty, Bh. V. 1986. ‘Forecasting of monsoon performance over India’, J. Climatol., 6, 347-354.

Bretherton, C. S., Smith, C. and Wallace, J. M. 1992. ‘An intercomparison of methods for finding coupled patterns in climate data’, J. Climate, 5, 541-560.

Cattell, R. B. 1966. ‘The scree test for the number of factors’, Multivar. Behav. Res., 1, 245-276.

Davis, R. E. 1976. ‘Predictability of sea surface temperature and sea level pressure over the north Pacific Ocean’, J. Phys. Oceanogr., 6, 249-266.

Davis, R. E. 1977. ‘Techniques for statistical analysis and prediction of geophysical fluid systems’, Geophys. Astrophys. Fluid Dyn., 8, 245-277.

Deque, M. and Servain, J. 1989. ‘Teleconnections between tropical Atlantic sea surface temperatures and midlatitude 50-kPa heights during the period 1964-1986’, J. Climate, 2, 929-944.

Glahn, H. R. 1968. ‘Canonical correlations and its relationship to discriminant analysis and multiple regression’, J. Atmos. Sci., 25, 23-31.

Gowarikar, V., Thapliyal, V., Sarkar, R. P., Mandal, G. S. and Sikka, D. R. 1989. ‘Parametric and power regression models: new approach to long-range forecasting’, Mausam, 40, 115-122.

Graham, N. E. 1990. ‘Seasonal relations between tropical Pacific SSTs and Northern Hemisphere 700-mb heights’, Proceedings of the Fourteenth Annual Climate Diagnostics Workshop, NOAA Climate Analysis Centre, pp. 184-191.

Graham, N. E., Michaelsen, J. and Barnett, T. P. 1987a. ‘An investigation of the El Niño Southern Oscillation cycle with statistical models, 1. Predictor fields characteristics’, J. Geophys. Res., 92, 14251-14270.

Graham, N. E., Michaelsen, J. and Barnett, T. P. 1987b. ‘An investigation of the El Niño Southern Oscillation cycle with statistical models, 2. Model results’, J. Geophys. Res., 92, 14271-14289.

Hastenrath, S. 1987. ‘On the prediction of Indian monsoon rainfall anomalies’, J. Clim. Appl. Meteorol., 26, 847-857.

Hastenrath, S. 1988. ‘Prediction of Indian monsoon rainfall: further exploration’, J. Climate, 1, 298-304.

Kung, E. C. and Sharif, T. A. 1980. ‘Regression forecasting of the Indian summer monsoon with antecedent upper air condition’, J. Appl. Meteorol., 19, 370-380.

Livezey, R. E. 1990. ‘Variability of skill of long-range forecasts and implications for their use and value’, Bull. Am. Meteorol. Soc., 71, 300-310.

Metz, W. 1989. ‘Low frequency anomalies of atmospheric flow and the effects of the cyclone-scale eddies: A canonical correlation analysis’, J. Atmos. Sci., 46, 1026-1041.

Mooley, D. A., Parthasarathy, B. and Pant, G. B. 1986. ‘Relationship between Indian summer monsoon rainfall and location of ridge at 500 hPa level along 75°E’, J. Clim. Appl. Meteorol., 25, 633-640.

Nicholls, N. 1984. ‘The stability of empirical long-range forecast techniques: a case study’, J. Clim. Appl. Meteorol., 23, 143-147.

Nicholls, N. 1987. ‘The use of canonical correlation to study teleconnections’, Mon. Wea. Rev., 115, 393-399.

Overland, J. E. and Preisendorfer, R. W. 1982. ‘A significance test for principal components applied to a cyclone climatology’, Mon. Wea. Rev., 110, 1-4.

Parthasarathy, B. and Pant, G. B. 1985. ‘Seasonal relationship between Indian summer monsoon rainfall and the Southern Oscillation’, J. Climatol., 5, 369-378.

Parthasarathy, B., Sontakke, N. A., Munot, A. A. and Kothawale, D. R. 1987. ‘Droughts/floods in the summer monsoon season over different meteorological subdivisions of India for the period 1871-1984’, J. Climatol., 7, 57-70.

Prasad, K. D. and Sikka, D. R. 1982. ‘Study of empirical functions of the height fields over India and their relation with the rainfall’, Proc. Ind. Acad. Sci. (Earth Planet Sci.), 91, 167-187.

Prasad, K. D. and Singh, S. V. 1992. ‘Possibility of predicting Indian monsoon rainfall on reduced spatial and temporal scales’, J. Climate, 5, 1357-1361.

Shukla, J. and Mooley, D. A. 1987. ‘Empirical prediction of the summer monsoon rainfall over India’, Mon. Wea. Rev., 115, 695-703.

Shukla, J. and Paolino, D. A. 1983. ‘The Southern Oscillation and long-range forecasting of the summer monsoon rainfall over India’, Mon. Wea. Rev., 111, 1830-1837.

Singh, S. V., Inamdar, S. R., Kripalani, R. H. and Prasad, K. D. 1986. ‘Relationship between 500 hPa ridge axis positions over the Indian and the West Pacific regions and the Indian summer monsoon rainfall’, Adv. Atmos. Sci., 3, 349-359.

Thapliyal, V. 1982. ‘Stochastic dynamic forecast model for monsoon rainfall in Peninsular India’, Mausam, 33, 399-404.

Wright, P. B. 1989. ‘Homogenized long-period Southern Oscillation Indices’, Int. J. Climatol., 9, 33-54.

Wallace, J. M., Smith, C. and Bretherton, C. S. 1992. ‘Singular value decomposition of sea surface temperature and 500-mb height anomalies’, J. Climate, 5, 561-576.