
International Journal of Humanities, Religion and Social Science ISSN : 2548-5725 | Volume 2, Issue 1 2017 www.doarj.org


GEOGRAPHICALLY WEIGHTED REGRESSION AND BAYESIAN GEOGRAPHICALLY WEIGHTED REGRESSION MODELLING WITH ADAPTIVE GAUSSIAN KERNEL WEIGHT FUNCTION ON THE POVERTY LEVEL IN WEST JAVA PROVINCE

Ikin Sodikin¹, Henny Pramoedyo², and Suci Astutik²

¹ Master Student of Statistics Department, Brawijaya University, Malang, Indonesia

² Lecturer of Statistics Department, Brawijaya University, Malang, Indonesia

Abstract: Geographically Weighted Regression (GWR) extends global regression analysis by producing a separate set of parameter estimates for each point or location where the data are observed and collected. This allows spatial influence to be accommodated in the estimation of the regression model. One important issue in GWR modelling is non-constant variance between observations. Bayesian GWR (BGWR) analysis is considered one of the best solutions to the problems that arise in GWR modelling. Through the Bayesian approach, observations that potentially generate non-constant variance can be detected and weighted directly, which reduces their effect on the estimation of the model parameters. In this study, the weights are computed with the adaptive Gaussian kernel function, for which the bandwidth varies by observation location. This weighting is applied to compare the parameter estimates of the GWR and BGWR models. The results show that the BGWR model is better than the GWR model in explaining the effect of literacy rate (%), percentage of households with shared latrines (%), and percentage of households receiving rice for the poor (%) on the district poverty level in West Java Province. This conclusion is based on the Mean Square Error (MSE), which is used as the goodness-of-fit criterion: the MSE of the BGWR model is $0.353 \times 10^2$, less than the MSE of the GWR model, $0.382 \times 10^2$.

Keywords: spatial, Bayesian, Geographically Weighted Regression, adaptive Gaussian kernel, non-constant variance, poverty

I. Introduction

As a developing country, Indonesia still faces poverty as one of its most serious problems. To overcome it, the government has made various efforts, among them estimating which areas are categorized as poor down to the village administration level, in the hope that poverty alleviation will become better targeted. Regression analysis has often been used to predict poverty rates, but the models are usually global: one set of coefficients is applied to all observed locations, without taking into account geographical location in terms of longitude and latitude. The spatial influences that arise make the assumption of independence between observations, which global regression requires, difficult to fulfil (A.S. Fotheringham, C. Brunsdon, and M. Charlton, 2002). One of the models developed to overcome these spatial problems is Geographically Weighted Regression (GWR).

GWR analysis is an expansion of global regression analysis that generates parameter estimators for each point or location where the data are observed and collected (A.S. Fotheringham, et al, 2002). This analysis can accommodate spatial influence in the estimation of the regression model. Let $Y$ be the $n \times 1$ vector of the response variable, $X$ the $n \times (p+1)$ matrix of explanatory variables, $W_i$ the $n \times n$ spatial weight matrix for the $i$-th location, $\beta_i$ the $(p+1) \times 1$ vector of parameter coefficients for the $i$-th location, and $\varepsilon_i$ the $n \times 1$ error vector for the $i$-th location, where $\varepsilon_i \sim N(0, \sigma^2)$ (Y. Leung, C. L. Mei, and W. X. Zhang, 2000). Mathematically, the GWR model can be written as follows:

$$W_i Y = W_i X \beta_i + \varepsilon_i$$

The GWR model parameters for the $i$-th location are estimated with the Weighted Least Squares (WLS) method, which gives:

$$\hat{\beta}_i = (X^T W_i X)^{-1} X^T W_i Y$$

One important issue in GWR modelling is non-constant variance between observations (H. S. Chan, 2008). It arises because the regression coefficients differ at each observation location. As a result, the error variance may also differ between locations, and the assumption of normally distributed errors may not be fulfilled.

The Bayesian GWR (BGWR) analysis introduced by LeSage is regarded as one of the right solutions to the problems that arise in GWR modelling (J. P. LeSage, 2001). The Bayesian approach applied to the GWR model is able to produce parameter estimators more effectively than the classical approach (I. Ntzoufras, 2009). In BGWR analysis, the error variance is assumed to be non-constant between the observed locations, i.e. $\varepsilon_i \sim N(0, \sigma^2 V_i)$, where $V_i$ is an $n \times n$ diagonal matrix containing the parameters $(v_1, v_2, \ldots, v_n)$, which indicate the non-constant variance between observation sites (H. S. Chan, 2008).

Unlike the GWR model, whose parameters are estimated with the Weighted Least Squares (WLS) method (I. M. Hutabarat, A. Saefuddin, A. Djuraidah, and I. W. Mangki, 2013), the BGWR model applies the Gibbs Sampling algorithm. This algorithm is a Markov Chain Monte Carlo (MCMC) simulation method that generates a sequence of samples from a given posterior distribution, so that the resulting set of draws approximates the joint posterior distribution of the parameters (I. Ntzoufras, 2009). The posterior distribution is formed by combining the prior information with the sample information expressed by the likelihood function.

The likelihood function of the BGWR model can be described as follows:

$$L(Y \mid \beta, \sigma^2, V) = \frac{1}{(2\pi)^{n/2}} \, \frac{1}{(\sigma^2)^{n/2}} \prod_{i=1}^{n} \frac{1}{(v_i)^{1/2}} \exp\left\{ -\sum_{i=1}^{n} \frac{(Y_i^* - X_i^* \beta)^2}{2\sigma^2 v_i} \right\}$$

$$L(Y \mid \beta, \sigma^2, V) \propto \sigma^{-n} \prod_{i=1}^{n} \frac{1}{(v_i)^{1/2}} \exp\left\{ -\sum_{i=1}^{n} \frac{(\varepsilon_i)^2}{2\sigma^2 v_i} \right\}$$

where $\varepsilon_i = Y_i^* - X_i^* \beta$, $Y_i^* = W_i Y$, and $X_i^* = W_i X$.

In this research, the BGWR model is completed with the following improper priors for each parameter (J. Geweke, 1993):

$$f(\beta_i) \propto \text{constant}, \qquad f(\sigma) \propto \sigma^{-1}, \quad \text{and}$$

$$f\left(\frac{r}{v_i}\right) \sim \text{iid}\ \frac{\chi^2_{(r)}}{r}, \quad i = 1, 2, \ldots, n, \quad \text{so} \quad f(V) \propto \prod_{i=1}^{n} v_i^{-(r+2)/2} \exp\left( \frac{-r}{2 v_i} \right)$$

1.1 Joint Posterior Distribution

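Based on Bayes' theorem and the assumption that the prior distributions are mutually independent, $f(\beta, \sigma^2, V) \propto f(\beta) \times f(\sigma) \times f(V)$, the joint posterior distribution can be written as follows:

$$f(\beta, \sigma^2, V_i \mid Y, X) \propto L(Y, X \mid \beta, \sigma^2, V_i) \times f(\beta) \times f(\sigma) \times f(V_i)$$

$$f(\beta, \sigma^2, V_i \mid Y, X) \propto \left[ \sigma^{-n} \prod_{i=1}^{n} \frac{1}{(v_i)^{1/2}} \exp\left\{ -\sum_{i=1}^{n} \frac{(\varepsilon_i)^2}{2\sigma^2 v_i} \right\} \right] \times \sigma^{-1} \times \left[ \prod_{i=1}^{n} (v_i)^{-(r+2)/2} \exp\left( \frac{-r}{2 v_i} \right) \right]$$

$$f(\beta, \sigma^2, V_i \mid Y, X) \propto \left[ \sigma^{-(n+1)} \prod_{i=1}^{n} (v_i)^{-(r+3)/2} \exp\left\{ -\sum_{i=1}^{n} \frac{\sigma^{-2} (\varepsilon_i)^2 + r}{2 v_i} \right\} \right] \tag{5}$$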

Through the Gibbs sampling algorithm, parameter values are generated in sequence from the full conditional distributions, and the resulting draws are used to form the joint posterior distribution. From the joint posterior distribution in equation (5), the full conditional distribution of each parameter in the BGWR model can be established.

1.2 Full Conditional Posterior Distribution of $\beta_i$

From the joint posterior distribution, the posterior distribution of $\beta_i$ conditional on $\sigma^2$ and $V$ can be formed as follows:

$$f(\beta_i \mid \sigma^2, V) \propto \exp\left\{ -\sum_{i=1}^{n} \frac{\sigma^{-2} (\varepsilon_i)^2}{2 v_i} \right\}$$

$$f(\beta_i \mid \sigma^2, V) \propto \exp\left\{ -\sum_{i=1}^{n} \frac{(Y_i^* - X_i^* \beta)^2 \, V_i^{-1}}{2\sigma^2} \right\}$$

The full conditional distribution of $\beta_i$ is a multivariate normal distribution with mean $\hat{\beta}(v) = [(X_i^*)^T V^{-1} (X_i^*)]^{-1} [(X_i^*)^T V^{-1} (Y_i^*)]$ and variance-covariance matrix $\sigma^2 [(X_i^*)^T V^{-1} (X_i^*)]^{-1}$, which can be written as:

$$f(\beta_i \mid \sigma^2, V) \sim N\left[ \hat{\beta}(v),\ \sigma^2 [(X_i^*)^T V^{-1} (X_i^*)]^{-1} \right] \tag{7}$$

1.3 Full Conditional Posterior Distribution of $\sigma^2$

From the joint posterior distribution, the posterior distribution of $\sigma^2$ conditional on $\beta_i$ and $V$ can be formed as follows:

$$f(\sigma^2 \mid \beta_i, V) \propto \sigma^{-(n+1)} \times \exp\left\{ -\sum_{i=1}^{n} \frac{\sigma^{-2} (\varepsilon_i)^2}{2 v_i} \right\}$$

$$f(\sigma^2 \mid \beta_i, V) \propto \sigma^{-(n+1)} \times \exp\left\{ -\frac{1}{2} \sum_{i=1}^{n} \frac{(\varepsilon_i)^2 / v_i}{\sigma^2} \right\}$$

As pointed out by Geweke (1993), the full conditional posterior distribution of $\sigma^2$ follows the chi-square distribution with $n$ degrees of freedom, stated as follows:

$$f\left( \sum_{i=1}^{n} \frac{(\varepsilon_i)^2 / v_i}{\sigma^2} \,\Big|\, \beta_i, V \right) \sim \chi^2_{(n)} \tag{9}$$

1.4 Full Conditional Posterior Distribution of $V$

From the joint posterior distribution, the posterior distribution of $V$ conditional on $\beta_i$ and $\sigma^2$ can be formed as follows:

$$f(V \mid \beta_i, \sigma^2) \propto \sigma^{-(n+1)} \prod_{i=1}^{n} (v_i)^{-(r+3)/2} \exp\left\{ -\sum_{i=1}^{n} \frac{\sigma^{-2} (\varepsilon_i)^2 + r}{2 v_i} \right\}$$

$$f(V \mid \beta_i, \sigma^2) \propto \prod_{i=1}^{n} (v_i)^{-(r+3)/2} \exp\left\{ -\frac{1}{2} \sum_{i=1}^{n} \frac{\sigma^{-2} (\varepsilon_i)^2 + r}{v_i} \right\}$$

As pointed out by Geweke (1993), the full conditional posterior distribution of $V$ follows the chi-square distribution with $r + 1$ degrees of freedom, where $r$ is the hyperparameter of $V$. The full conditional posterior distribution of $V$ is stated as follows:

$$f\left( \left[ \frac{\sigma^{-2} (\varepsilon_i)^2 + r}{v_i} \right] \Big|\, \beta_i, \sigma^2 \right) \sim \chi^2_{(r+1)} \tag{11}$$

1.5 Gibbs Sampling Algorithm

The Gibbs Sampling algorithm proceeds through the following steps [4]:

1) Set random initial values for the parameters $\beta_i^{(0)}$, $\sigma^{2(0)}$, and $V^{(0)}$,

2) For each observation $i = 1$ to $n$:

a. Draw $\beta_i^{(1)}$ from $f(\beta_i \mid \sigma^{2(0)}, V^{(0)})$ using equation (7),

b. Draw $\sigma^{2(1)}$ from $f(\sigma^2 \mid \beta_i^{(1)}, V^{(0)})$ using equation (9), and

c. Draw $V^{(1)}$ from $f(V \mid \beta_i^{(1)}, \sigma^{2(1)})$ using equation (11),

3) Replace the values $\beta_i^{(0)}$, $\sigma^{2(0)}$, and $V^{(0)}$ in step 1) with $\beta_i^{(1)}$, $\sigma^{2(1)}$, and $V^{(1)}$,

4) Repeat steps 2) and 3) for $q$ iterations, until convergence,

5) Discard the first $c$ draws (burn-in period) to reduce the influence of the initial values, and

6) Check convergence by calculating the MC error, and test the significance of the parameters using 95% credible intervals.
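The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' Matlab implementation: the function name `gibbs_bgwr_location`, the fixed (rather than random) starting values, and the synthetic demo data are all hypothetical, and the inputs are assumed to be the already weighted data $W_i X$ and $W_i Y$ for a single location. The three draws follow equations (7), (9), and (11).

```python
import numpy as np

def gibbs_bgwr_location(X_star, y_star, r=8, n_iter=550, burn_in=50, seed=0):
    """Gibbs sampler for one location i; returns the retained draws of beta_i."""
    rng = np.random.default_rng(seed)
    n, k = X_star.shape
    sigma2 = 1.0                      # step 1: starting values (fixed here for simplicity)
    v = np.ones(n)
    draws = np.empty((n_iter, k))
    for t in range(n_iter):
        # equation (7): beta_i | sigma2, V is multivariate normal
        XtVinv = X_star.T / v         # X*' V^-1, since V is diagonal
        A = XtVinv @ X_star
        beta_hat = np.linalg.solve(A, XtVinv @ y_star)
        cov = sigma2 * np.linalg.inv(A)
        beta = rng.multivariate_normal(beta_hat, (cov + cov.T) / 2)
        # equation (9): sum(e^2 / v) / sigma2 ~ chi-square(n)
        e = y_star - X_star @ beta
        sigma2 = np.sum(e**2 / v) / rng.chisquare(n)
        # equation (11): (e_i^2 / sigma2 + r) / v_i ~ chi-square(r + 1)
        v = (e**2 / sigma2 + r) / rng.chisquare(r + 1, size=n)
        draws[t] = beta
    return draws[burn_in:]            # step 5: discard the burn-in draws

# Synthetic demo data (not the West Java data): intercept 1, slope 2
demo_rng = np.random.default_rng(1)
X_demo = np.column_stack([np.ones(200), demo_rng.normal(size=200)])
y_demo = X_demo @ np.array([1.0, 2.0]) + demo_rng.normal(scale=0.5, size=200)
draws = gibbs_bgwr_location(X_demo, y_demo)
print(draws.mean(axis=0))             # posterior means of the two coefficients
```

With the defaults, 550 iterations are run and 50 are discarded, so 500 draws remain per coefficient, matching the settings described in the Methods section.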

The choice of weighting function is still a matter of trial and error, so in GWR research the weighting function is usually selected on the basis of previous studies. Several studies on BGWR have used fixed weighting functions, both fixed Gaussian and fixed bi-square kernels. BGWR analysis with a different bandwidth at each location has not been carried out, so this study uses the adaptive Gaussian kernel weighting function to compare GWR and BGWR analyses of poverty in West Java Province.
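The paper does not state the kernel formula, so the sketch below assumes one common form of the adaptive Gaussian kernel, $w_{ij} = \exp\{-\tfrac{1}{2}(d_{ij}/h_i)^2\}$, in which the bandwidth $h_i$ is the distance from location $i$ to its $k$-th nearest neighbour. The function name, the parameter `k`, and the coordinates are hypothetical and only illustrate how the bandwidth can vary by location.

```python
import math

def adaptive_gaussian_weights(coords, i, k):
    """Kernel weights of every location relative to location i.

    The bandwidth h_i is the distance from i to its k-th nearest
    neighbour (an assumed rule), so dense areas get a narrow kernel
    and sparse areas a wide one.
    """
    xi, yi = coords[i]
    dists = [math.hypot(x - xi, y - yi) for x, y in coords]  # Euclidean distances
    h_i = sorted(dists)[k]            # index 0 is location i itself (distance 0)
    return [math.exp(-0.5 * (d / h_i) ** 2) for d in dists]

# Hypothetical coordinates; distances from the first point are 0, 1, 2, 5
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 4.0)]
w = adaptive_gaussian_weights(coords, i=0, k=2)   # bandwidth h_0 = 2
print(w)
```

The weight is 1 at the location itself and decays with distance; a location exactly one bandwidth away receives the weight $e^{-1/2}$.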

II. Methods

The study area is West Java Province, which consists of 27 districts/cities (Figure 1). The data were obtained from BPS (Statistics Indonesia) in 2016 and comprise, for each district, the percentage poverty rate, the literacy rate, the percentage of households with shared latrines, and the percentage of households receiving rice for the poor.

The analysis procedure is as follows. First, spatial heterogeneity in the poverty rate data is tested using the Breusch-Pagan test [2]. Second, the weight matrices are calculated with the adaptive Gaussian kernel function, which requires calculating the Euclidean distances between locations and determining the optimum bandwidth. Third, the GWR model parameters are estimated and tested, and the precision of the model is assessed with the Mean Square Error (MSE). Fourth, the BGWR model is fitted with the Gibbs Sampling algorithm based on MCMC simulation [4]. The simulation is run for 550 iterations, and the first 50 iterations are discarded to eliminate the effect of the initial values. Convergence of the simulation output is assessed from the MC error, for which the output is divided into 50 batches. The BGWR parameter estimates are the means of the 500 retained draws. To test the significance of the BGWR parameters, the 95% credible interval is calculated, with the 2.5th percentile as the lower limit and the 97.5th percentile as the upper limit. Finally, the estimation results of the GWR and BGWR models are compared on the basis of their MSE values.
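As an illustration of the GWR estimation step, the following sketch applies the WLS estimator $(X^T W_i X)^{-1} X^T W_i Y$ at every location. The study itself used the spgwr package in R; the function names `gwr_local_fit` and `gwr_fit`, the stand-in weights, and the toy data here are assumptions made only for this example.

```python
import numpy as np

def gwr_local_fit(X, y, w):
    """WLS estimate at one location: (X' W X)^-1 X' W y with W = diag(w)."""
    XtW = X.T * w                     # scales column j of X' by the weight w[j]
    return np.linalg.solve(XtW @ X, XtW @ y)

def gwr_fit(X, y, W):
    """Row i of W holds the kernel weights for location i (n x n array)."""
    betas = np.array([gwr_local_fit(X, y, W[i]) for i in range(len(y))])
    fitted = np.sum(X * betas, axis=1)    # y_hat_i = x_i' beta_i
    return betas, fitted

# Toy data: 5 locations lying exactly on the line y = 1 + 2x
x = np.arange(5.0)
X = np.column_stack([np.ones(5), x])      # first column is the intercept
y = 1.0 + 2.0 * x
W = np.random.default_rng(0).uniform(0.1, 1.0, size=(5, 5))  # stand-in weights
betas, fitted = gwr_fit(X, y, W)
print(betas)
```

Because the toy data lie exactly on one line, every local fit recovers the same intercept and slope regardless of the weights; with real data the rows of `betas` differ by location, which is what the summary of GWR estimates in Table 3 describes.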

Several statistical software packages were used in this research: RStudio 1.0.143 (spgwr package) for the GWR analysis, Matlab R2012a for the BGWR analysis, and ArcGIS 9.0 for mapping the results of the analysis.

Figure 1. Map of West Java Province (27 districts/city)


Description of district names in Figure 1:

1. Bogor
2. Sukabumi
3. Cianjur
4. Bandung
5. Garut
6. Tasikmalaya
7. Ciamis
8. Kuningan
9. Cirebon
10. Majalengka
11. Sumedang
12. Indramayu
13. Subang
14. Purwakarta
15. Karawang
16. Bekasi
17. Bandung Barat
18. Pangandaran
19. Bogor City
20. Sukabumi City
21. Bandung City
22. Cirebon City
23. Bekasi City
24. Depok City
25. Cimahi City
26. Tasikmalaya City
27. Banjar City

III. Results and Discussion

3.1 Data Description

The descriptive statistics of the four variables (range, minimum, maximum, mean, and standard deviation) are given in Table 1.

The poverty rate of the districts in West Java Province has a range of 48.58 and a standard deviation of 8.61 (Table 1). The percentage of households with shared latrines and the percentage of households receiving rice for the poor also have a high range and standard deviation. This indicates that the research variables vary considerably between districts in West Java Province.

Table 1. Description of Research Data

Research Variables                               Range (%)   Min (%)   Max (%)   Mean (%)   Standard Deviation (%)
Level of Poverty ($Y$)                           48.58       2.40      50.98     11.35      8.61
Literacy Rate ($X_1$)                            3.43        96.57     100.00    99.44      0.94
Households with Shared Latrines ($X_2$)          31.30       68.25     99.55     89.75      7.66
Households Receiving Rice for the Poor ($X_3$)   73.62       11.12     84.74     50.92      20.39

The distribution of district poverty levels in West Java Province is mapped in Figure 2. The districts with the highest poverty levels are in the eastern region, especially Indramayu Regency. This is because the area is located far from the provincial capital and lies near the border with Central Java Province.


Figure 2. Map of Poverty Level in West Java Province (27 districts/city)

In general, the variables used in a regression analysis should be checked to confirm that the explanatory variables are related to the response variable and that the assumption of no multicollinearity is satisfied. Therefore, the Pearson correlations among the research variables are calculated, as shown in Table 2.

Table 2. Pearson correlation value between research variables

Variables      Pearson correlation value   p-value
$Y$-$X_1$      -0.549                      0.003
$Y$-$X_2$      -0.423                      0.012
$Y$-$X_3$      0.606                       0.001
$X_1$-$X_2$    0.299                       0.130
$X_1$-$X_3$    -0.120                      0.549
$X_2$-$X_3$    -0.210                      0.170

Table 2 shows that $X_3$ and $X_1$ have a strong positive and a strong negative relationship, respectively, with the response variable $Y$ (0.606 and -0.549), while $X_2$ has a weaker negative relationship (-0.423). All explanatory variables can be used in this study because the relationships among the explanatory variables are all weak, which indicates the absence of multicollinearity.
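For readers who want to reproduce this kind of check, the sketch below computes a Pearson correlation and the usual $t$ statistic for testing it, $t = r\sqrt{n-2}/\sqrt{1-r^2}$. The helper names and the small data set are hypothetical; only the values $r = -0.549$ (from Table 2) and $n = 27$ districts come from the paper.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t statistic for H0: correlation = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

r_demo = pearson_r([1, 2, 3, 4], [2, 4, 6, 9])   # hypothetical data
t_y_x1 = t_statistic(-0.549, 27)                 # Y-X1 correlation from Table 2
print(r_demo, t_y_x1)                            # t is roughly -3.28
```

The $p$-value then follows from the $t$ distribution with $n - 2 = 25$ degrees of freedom, which needs a statistics library and is left out of this sketch.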

3.2 Breusch Pagan Test

Conditions in a district are influenced by the conditions of the surrounding districts. In addition, the socio-cultural, economic, and geographical conditions of a district differ from those of other districts. This suggests the presence of spatial heterogeneity. The Breusch-Pagan test yields a BP statistic of 33.773. This value is greater than the critical value $\chi^2_{(0.05)(4)} = 9.487$, so spatial heterogeneity is present. In other words, the relationship between the explanatory variables and the poverty level differs between one district and another in West Java. This spatial heterogeneity can be handled with GWR and BGWR.


3.3 The Results of Adaptive Gaussian Kernel GWR Analysis

Table 3 summarizes the parameter estimates of the GWR analysis.

Table 3. Description of the Estimators of the GWR Adaptive Gaussian Kernel Model

                $\beta_0$   $\beta_1$   $\beta_2$   $\beta_3$
Minimum         43.740      -5.740      0.038       0.146
1st Quartile    55.280      -5.696      0.055       0.152
Median          96.620      -1.058      0.111       0.169
3rd Quartile    536.900     -0.584      0.286       0.299
Maximum         540.700     -0.449      0.291       0.302

The negative values of the coefficient $\beta_1$ indicate that the literacy rate ($X_1$) contributes negatively to the poverty rate in West Java Province: a higher percentage of literate people in the last year reduces the poverty level of the district/city. The positive values of the coefficients $\beta_2$ and $\beta_3$ show that the explanatory variables $X_2$ and $X_3$ contribute positively to the response variable: the higher the percentage of households using shared latrines, or the higher the percentage of households receiving rice for the poor, the higher the percentage of poor people in the district/city.

The estimated parameters of the GWR model are tested partially to determine whether each parameter has a significant effect on the poverty level in West Java Province. The partial test uses the $t$ test statistic: if the $t$ statistic is greater in absolute value than the critical point $t_{(\alpha/2)(n-p-1)}$, the parameter is judged to have a significant effect. The explanatory variables that have a significant influence on the poverty rate of each district/city in West Java Province are presented in Table 4.

Table 4. District/City Grouping Based on the Explanatory Variables with Significant Effect in the GWR Model

Variables                   District/city
$X_1$ and $X_3$             Indramayu, Cirebon, Majalengka, Kuningan, Ciamis, Tasikmalaya, Pangandaran, Cirebon City, Tasikmalaya City, and Banjar City
$X_3$                       Sukabumi
No significant variables    Karawang, Bekasi, Subang, Cianjur, Purwakarta, Bogor, Sumedang, Bandung Barat, Garut, Bandung City, Cimahi City, Bogor City, Sukabumi City, and Bekasi City

The literacy rate and the percentage of households receiving rice for the poor have a significant effect on the poverty level in most of the eastern part of West Java Province. In the middle region, no variable has a significant effect. This is shown visually in Figure 3.


Figure 3. Map of District/City Grouping Based on the Explanatory Variables with Significant Effect in the GWR Model

3.4 The Results of Adaptive Gaussian Kernel Bayesian GWR (BGWR) Analysis

In BGWR, the regression coefficients are estimated through an MCMC simulation with the Gibbs Sampling algorithm. The hyperparameter $r$ of $V$ in equation (11) is set, based on previous research, to 8, 15, 25, and 35. The MCMC simulation of 550 iterations shows convergent results from the 51st iteration, so the first 50 outputs are discarded to reduce the influence of the initial values. Convergence is indicated by an MC error of less than 1% of the standard deviation of the simulated output and by trace plots that tend to follow a horizontal line pattern (Figure 4).
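The MC error check can be illustrated with the batch means method, using the 50 batches mentioned in the Methods section. This is a sketch under assumptions: the paper does not give its exact MC error formula, so the standard batch means estimator is used here, and the function name and the simulated draws are hypothetical.

```python
import math
import random
import statistics

def mc_error_batch_means(draws, n_batches=50):
    """Monte Carlo error of the posterior mean, estimated from batch means."""
    size = len(draws) // n_batches
    means = [statistics.fmean(draws[b * size:(b + 1) * size])
             for b in range(n_batches)]
    grand = statistics.fmean(means)
    # variance of the overall mean, estimated from the spread of the batch means
    var_of_mean = sum((m - grand) ** 2 for m in means) / (n_batches * (n_batches - 1))
    return math.sqrt(var_of_mean)

# Hypothetical chain of 500 retained draws for one coefficient
random.seed(0)
chain = [random.gauss(0.0, 1.0) for _ in range(500)]
mce = mc_error_batch_means(chain)
sd = statistics.stdev(chain)
print(mce, mce / sd)    # MC error and its ratio to the posterior standard deviation
```

The ratio `mce / sd` is the quantity compared against the 1% threshold described above.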

[Figure 4, panels (a), (b), (c): for beta1-i, beta2-i, and beta3-i with r = 8, an MC error plot comparing the MCE with 1% of the standard deviation for districts 1 to 27, and a trace plot of the simulated values]

Figure 4. MC Error and Trace Dynamic Plot of MCMC Simulation Output with hyperparameter $r = 8$

The estimated parameters of the BGWR model are the averages of the 500 retained MCMC draws from the Gibbs Sampling algorithm. Based on these draws, a 95% credible interval is calculated to determine whether each parameter has a significant effect on the poverty level in West Java. The 95% credible interval is bounded below by the 2.5th percentile and above by the 97.5th percentile. A parameter is said to be significant if its 95% credible interval does not contain zero.
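The significance rule can be sketched with the Python standard library as follows. The function names and the simulated draws are hypothetical, and the percentile interpolation rule (`method="inclusive"`) is an assumption, since the paper does not say how the percentiles were computed.

```python
import random
import statistics

def credible_interval_95(draws):
    """Equal-tailed 95% interval: the 2.5th and 97.5th percentiles of the draws."""
    cuts = statistics.quantiles(draws, n=40, method="inclusive")  # steps of 2.5%
    return cuts[0], cuts[-1]

def is_significant(draws):
    """A parameter is significant if its 95% credible interval excludes zero."""
    lower, upper = credible_interval_95(draws)
    return not (lower <= 0.0 <= upper)

# Hypothetical posterior draws for two coefficients
random.seed(1)
away_from_zero = [random.gauss(2.0, 0.5) for _ in range(500)]
around_zero = [random.gauss(0.0, 0.5) for _ in range(500)]
print(is_significant(away_from_zero), is_significant(around_zero))
```

The first set of draws is centred four standard deviations from zero and is flagged as significant; the second is centred on zero and is not.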

The explanatory variables that have a significant effect on the poverty level of each district/city in West Java Province, judged by the credible intervals of the BGWR model, are consistent across the values of the hyperparameter $r$. The grouping of districts by the explanatory variables that are significant in the BGWR model is presented in Table 5.

Table 5. District/City Grouping Based on the Explanatory Variables with Significant Effect in the BGWR Model

Variables                   District/city
$X_1$, $X_2$, and $X_3$     Bekasi, Bogor, Sukabumi, Tasikmalaya, Pangandaran, Indramayu, Cirebon, Majalengka, Kuningan, Garut, Ciamis, Sukabumi City, Bekasi City, Depok City, Bogor City, Cirebon City, Tasikmalaya City, and Banjar City
$X_1$ and $X_2$             Subang, Purwakarta, Sumedang, Bandung, Cianjur, Bandung City, and Cimahi City
$X_3$                       Karawang and Bandung Barat

All three variables significantly influence the poverty level of districts in the western and eastern parts of West Java Province, while the middle region is dominated by the variables $X_1$ and $X_2$. Compared with the GWR model, the three explanatory variables in the BGWR model have a significant influence in almost all districts, as shown visually in Figure 5.

Figure 5. Map of District/City Grouping Based on the Explanatory Variables with Significant Effect in the BGWR Model

3.5 Selection of Best Model

One way to choose the best model is to compare the Mean Square Error (MSE), an indicator of the accuracy of a model (goodness of fit). A smaller MSE indicates a better model. The MSE values of the GWR and BGWR models are compared in Table 6.


Table 6. MSE values for GWR and BGWR models

MSE GWR ($\times 10^2$)    MSE BGWR ($\times 10^2$)
                           r=8      r=15     r=25     r=35
0.382                      0.371    0.366    0.359    0.353

Based on Table 6, the BGWR model with hyperparameter $r = 35$ is the best BGWR model, with the smallest average MSE of $0.353 \times 10^2$. Table 6 also shows that the mean MSE decreases as the value of the hyperparameter $r$ increases.
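The comparison can be made concrete with a few lines of Python. The `mse` helper is a generic definition, not the authors' code; the numbers are the values reported in Table 6 (in units of $10^2$).

```python
def mse(observed, fitted):
    """Mean square error between observed and fitted values."""
    return sum((o - f) ** 2 for o, f in zip(observed, fitted)) / len(observed)

# MSE values reported in Table 6 (in units of 10^2)
mse_gwr = 0.382
mse_bgwr = {8: 0.371, 15: 0.366, 25: 0.359, 35: 0.353}

best_r = min(mse_bgwr, key=mse_bgwr.get)               # r with the smallest MSE
reduction = (mse_gwr - mse_bgwr[best_r]) / mse_gwr     # relative gain over GWR
print(best_r, f"{reduction:.1%}")                      # 35 7.6%
```

On the reported values, the best BGWR model ($r = 35$) lowers the MSE by about 7.6% relative to the GWR model.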

IV. Conclusion

The BGWR analysis with the adaptive Gaussian kernel weighting function and various values of the hyperparameter $r$ indicates that $r = 35$ gives the best BGWR model for estimating the district/city poverty level in West Java Province, with the smallest mean MSE of $0.353 \times 10^2$. Compared with the GWR model, the BGWR model is also the more suitable model for district/city-level poverty modelling in West Java Province.

REFERENCES

A.S. Fotheringham, C. Brunsdon and M. Charlton. (2002). Geographically Weighted Regression: The Analysis of Spatially Varying Relationships. England: John Wiley & Sons Ltd.

H. S. Chan. (2008). Incorporating The Concept of Community Into A Spatially Weighted Local Regression Analysis. Canada: University of New Brunswick.

I. M. Hutabarat, A. Saefuddin, A. Djuraidah and I. W. Mangki. (2013). Estimating the Parameters Geographically Weighted Regression (GWR) with Measurement Error. Open Journal of Statistics, 3, 417-421. doi:10.4236/ojs.2013.36049. http://dx.doi.org/10.4236/ojs.2013.36049.

I. Ntzoufras. (2009). Bayesian Modelling Using WinBUGS. USA: John Wiley & Sons Inc.

J. Geweke. (1993). Bayesian Treatment of the Independent Student-t Linear Model. Journal of Applied Econometrics, 8, S19-S40. http://www.jstor.org/stable/2285073.

J.P. LeSage. (2001). A Family of Geographically Weighted Regression Models. Ohio: University of Toledo.

Y. Leung, C. L. Mei and W. X. Zhang. (2000). Statistical Tests for Spatial Nonstationarity based on The Geographically Weighted Regression Model. Environment and Planning A, 32, 9-32. doi:10.1068/a3162. https://doi.org/10.1068/a3162.