International Journal of Computation and Applied Sciences IJOCAAS, Volume2, Issue 2, April 2017, ISSN: 2399-4509
A Review of Chronic Diseases Map
and Data Trends Analysis
Mustafa Yousef Mustafa ZGHOUL
Abstract: This paper reviews and analyzes the status of chronic diseases in the Gulf region, where they are the major disease burden, and illustrates methods for visualizing chronic disease information. It reviews and discusses methods for analyzing chronic disease data with reference to the univariate time series model. A forecasting model based on simple linear regression is proposed and implemented. Visualization of digital data is considered one of the most important techniques for presenting the distribution of statistical data within a small space. An interactive disease map is proposed and implemented to show the numbers and locations of chronic disease cases.
Keywords: Chronic diseases, visualization, statistical data
analysis, linear regression, time series, forecasting, mapping
I. INTRODUCTION
Chronic diseases (CD) pose a constant and substantial danger because of their growing risk to human life and to national economies. Statistics issued by the World Health Organization (WHO) indicate that the number of people living with chronic diseases is increasing in all parts of the world, as illustrated in Figure 1.
Fig. 1: The total NCD deaths by region 2012
(AFR: Africa; AMR: America; SEAR: South East Asia; EUR: European;
EMR: Eastern Mediterranean; WPR: Western Pacific; Source WHO)
Masters Student in Computer Science, Faculty of Computing and IT, Sohar
University, Sohar, Sultanate of Oman, [email protected]
In addition, WHO reported that chronic diseases are the major disease burden in the Gulf region: they cause more than 60% of all deaths in the Gulf Cooperation Council (GCC) countries [1], as shown in Figure 2.
Fig. 2: The Diabetes Growth Rate and Affected Population (in ‘000)
during 2000 to 2030, Source WHO.
Moreover, according to Ministry of Health statistics in the Sultanate of Oman for the period 1990 to 2005, approximately 75% of the disease burden is attributable to chronic diseases [2]. The total number of deaths caused by chronic diseases is double the number of deaths caused by infectious diseases such as pulmonary tuberculosis, viral hepatitis (A), malaria, and AIDS (HIV). In addition, chronic diseases affect women and men at younger ages. A number of risk factors cause chronic diseases, including physical inactivity, unhealthy diet, and tobacco use. Chronic disease is a real danger that is growing rapidly over time; therefore, many governments have spent billions of dollars on controlling and preventing the expansion of CD [3]. WHO reports indicate that the occurrence of CD will rise dramatically by 2023, as shown in Figure 3.
For this reason, one way to reduce the rate of chronic diseases is to increase health awareness in the community. Therefore, a web application for chronic disease surveillance is much needed; it will help raise knowledge of chronic diseases and of methods for reducing their effects.
Fig. 3: Expected rate of CD in 2023 (Source WHO)
A chronic disease (or non-communicable disease) is a disease that persists for a long period of time, generally three months or more. Such diseases generally cannot be prevented by a treatment or by taking medicine. The Ministry of Health in the Sultanate of Oman is working on educating members of the community about chronic diseases such as diabetes and high blood pressure in order to increase health awareness. An infectious disease (or communicable disease) is a disease caused by microorganisms such as bacteria, parasites, fungi, and viruses. It can spread quickly to other people through sneezing and coughing or through physical contact. Examples of infectious diseases are malaria, pulmonary tuberculosis, viral hepatitis (A), and AIDS (HIV) [4]. Governments therefore work hard to reduce the expansion of chronic and infectious diseases by offering intensive programs and studies of injury prevention methods [5]. The main objective of this paper is to promote awareness of the expansion of chronic diseases in the GCC. To that end, a web application for analyzing, forecasting, and visualizing chronic diseases is much needed. This system will collect chronic disease data and present analysis statistics and rates for all GCC countries. In addition, it will provide clear visualizations based on an interactive disease map.
II. VISUALIZATION METHODS AND TECHNIQUES
Many techniques are used to visualize and analyze data. The interactive disease map is one powerful method of data visualization. It is used to illustrate information clearly and efficiently via plots and statistical and information graphics. The visualization of statistical data aims to present a large amount of information in a short time. Numerical data can be graphed using lines, bars, or dots to visually communicate a quantitative message. Recent studies have shown that the interactive disease map helps users analyze and understand the meaning of complex data easily. Generally, tables are used where users look up a specific measurement, while charts and maps are used to present patterns or relationships in multivariate data. The disease map is a method for representing different diseases in order to track their expansion in real time and to understand their causes [6]. A time series is a group of observations or statistics that are recorded or collected at regular intervals and sorted by time [7]. The proposed system will therefore analyze such series to predict unseen data over time. Moreover, trend analysis is a type of technical analysis that tries to predict future values of data (information in sequence over time) based on past data [8],[9]. Statistical data can be analyzed with the trend analysis method through linear regression analysis, which will help decision makers in the Ministry of Health determine the expansion of chronic diseases and the related needs [10],[11].
Cartographic visualization uses symbolism techniques: symbols are placed at locations inside the map and represent multiple data values for those locations [12]. In addition, it allows users to select statistical and geographical subsets. Local statistics of univariate distributions can be calculated and visualized in dynamic figures for exploration, such as scatterplots, dot plots, population cartograms, choropleths, parallel coordinates plots, and polygon maps, as shown in Figure 4.
Fig. 4: Cartographic visualization source [12]
According to Statistics Netherlands (2012), "data visualization is the art of presenting data in a visual manner. So the data becomes apparent. Data visualization is a helpful tool for all phases of the statistical process. It has two goals in statistics: data exploration and communication" [13]. Diagrams can reveal patterns in large amounts of data and present the major findings to users. There are many types of diagrams, such as radar plots, bar charts, and line charts; they are used to represent time series or to compare two variables. The proportional symbol map, on the other hand, is a good way of viewing statistical data: it shows the characteristics of a subject on the map. "In a proportional symbol map, a symbol is plotted on the center of a region, and the surface area of the symbol is scaled with the value of the variable" [13]. For example, consider the proportional symbol map shown in Figure 5.
Fig. 5: Proportional symbol map source [13]
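The scaling rule quoted from [13] means that the radius of a circular symbol should grow with the square root of the value, so that the area (not the radius) is proportional to it. The following is a minimal sketch of that rule; the function name, the `max_radius` parameter, and the region counts are hypothetical and are not taken from the paper.

```python
import math

def symbol_radius(value, max_value, max_radius=30.0):
    """Return a circle radius whose AREA is proportional to value.

    Area = pi * r^2, so r must scale with sqrt(value) rather than value.
    max_radius is the radius drawn for the largest value (an assumed unit,
    for example pixels).
    """
    if value < 0 or max_value <= 0:
        raise ValueError("value must be >= 0 and max_value must be > 0")
    return max_radius * math.sqrt(value / max_value)

# Hypothetical case counts, invented only to illustrate the scaling.
counts = {"Region A": 400, "Region B": 100, "Region C": 25}
largest = max(counts.values())
radii = {name: symbol_radius(v, largest) for name, v in counts.items()}

for name, r in radii.items():
    print(f"{name}: radius {r:.1f}")
```

A region with one quarter of the largest value gets half the radius, which keeps the visual area in proportion to the data.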
According to Heinrich Hartmann (2016), "statistical techniques are the art of extracting information from data. Moreover, one of essential data analysis methods is visualization. The human brain can process geometric information much more rapidly than numbers." [14]. There are many methods for visualization, such as rug plots and histograms (for one-dimensional data), scatter plots (for two-dimensional data), and line plots [14]. As an example, consider the visualization methods shown in Figure 6.
Fig. 6: Visualization methods source [14]
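As an illustration of the histogram mentioned above for one-dimensional data, the sketch below counts values in equal-width bins and prints a text bar per bin. The function name, the bin width, and the sample values are assumptions made for this example only.

```python
def histogram(values, bin_width, start=None):
    """Count values in equal-width bins [left, left + bin_width).

    Returns a dict mapping each non-empty bin's left edge to its count.
    """
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    if start is None:
        start = min(values)
    counts = {}
    for v in values:
        k = int((v - start) // bin_width)  # index of the bin containing v
        counts[k] = counts.get(k, 0) + 1
    return {start + k * bin_width: counts[k] for k in sorted(counts)}

# Hypothetical patient ages, invented for illustration.
ages = [23, 35, 37, 41, 48, 52, 55, 58, 61, 67]
bins = histogram(ages, bin_width=10, start=20)

for left, n in bins.items():
    print(f"[{left}, {left + 10}) {'#' * n}")
```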
The histogram is one of the most popular visualization methods and is commonly used to show the distribution of numerical data [14]. According to Ben Fry (2008), "within a small space, the visual can communicate (or announce) more information than the table." [15]. Mapping is one of the modern techniques used to visualize information. The mapping process starts with collecting the data values and storing them in a data set (database). The map is then designed (or drawn) to represent the data values. Next, locations (or points) are determined on the map; usually these locations are the centers of the governorates. Finally, functions load the data values from the database and display them on the map itself. As an example, consider the visualization of varying data by symbol size on the map shown in Figure 7 [15].
Fig. 7: Visualization of varying data by symbol size on map
source [15]
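The mapping steps described above (store the values in a database, fix one point per governorate, then load the values for display) can be sketched as follows. This is only an illustration under stated assumptions: it uses an in-memory SQLite database, and the table name, column names, region names, coordinates, and counts are all hypothetical placeholders rather than data from the paper.

```python
import sqlite3

def build_dataset(rows):
    """Store (region, lat, lon, n_cases) rows in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE disease_counts ("
        "region TEXT PRIMARY KEY, lat REAL, lon REAL, n_cases INTEGER)"
    )
    conn.executemany("INSERT INTO disease_counts VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn

def load_markers(conn):
    """Load one marker per region: its center point and its data value."""
    cur = conn.execute(
        "SELECT region, lat, lon, n_cases FROM disease_counts ORDER BY region"
    )
    return [
        {"region": region, "lat": lat, "lon": lon, "n_cases": n}
        for region, lat, lon, n in cur
    ]

# Placeholder rows; names, coordinates, and counts are invented.
rows = [("Region A", 23.5, 58.4, 120), ("Region B", 24.3, 56.7, 80)]
conn = build_dataset(rows)
markers = load_markers(conn)

for m in markers:
    print(m["region"], (m["lat"], m["lon"]), m["n_cases"])
```

Each marker could then be handed to whatever drawing layer renders the map, with the symbol size derived from `n_cases`.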
According to Adriana REVEIU and Marian DARDALA (2011), cartographic visualization provides facilities for representing statistical data and is one of the most important tools in geographical information systems (GIS). It aims to show the distribution of statistical data inside a regional map. Moreover, it uses different symbols to show more information at the same time inside the map; these symbols present quantitative details to users [16].
III. ANALYSIS OF CHRONIC DISEASES
The analysis of chronic diseases often depends on longitudinal cohort (or group) studies, and chronic disease studies raise many methodological issues. The first is that the definitions of risk factors and outcomes change over time. The second is missing data.
There are several procedures for performing an analysis in the presence of missing data. The first is to analyze only the individuals whose data are complete. The second is to fill in (impute) values for the individuals whose data are incomplete and then analyze the dataset. The second technique is the most appropriate. There are many analytic techniques for chronic disease modeling [17]:
1. Logistic regression analysis: this can examine the effects of risk factors on the development of the disease.
2. Survival analysis (time-to-event data): this deals with the time until the event of interest occurs.
3. Neural networks.
4. Tree-based classification methods.
5. Longitudinal data analysis (mixed models, generalized linear models, and generalized estimating equations).
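The two missing-data procedures described earlier in this section can be contrasted with a small sketch. It assumes that missing values are encoded as `None` and uses mean imputation as one simple way of filling them in; the paper does not specify an imputation rule, and the records below are invented for illustration.

```python
from statistics import mean

def complete_cases(records):
    """Procedure 1: keep only individuals with no missing field."""
    return [r for r in records if None not in r.values()]

def impute_mean(records, field):
    """Procedure 2: replace missing values of one field by the observed mean.

    Mean imputation is an assumption of this sketch, not the paper's method.
    """
    observed = [r[field] for r in records if r[field] is not None]
    fill = mean(observed)
    return [{**r, field: fill if r[field] is None else r[field]} for r in records]

# Hypothetical records; None marks a missing measurement.
records = [
    {"age": 50, "glucose": 6.0},
    {"age": 60, "glucose": None},
    {"age": 70, "glucose": 8.0},
]

kept = complete_cases(records)             # drops the incomplete individual
filled = impute_mean(records, "glucose")   # keeps all three individuals
print(len(kept), [r["glucose"] for r in filled])
```

Complete-case analysis shrinks the sample, while imputation keeps every individual at the cost of introducing estimated values.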
The regression analysis model is represented by the formula:
Yi = a + b * Xi
where the regression parameters are a, the intercept (on the y axis), and b, the slope of the regression line.
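The least squares method is used to estimate the slope and intercept regression parameters as follows.
The formula for calculating the slope regression parameter (b) is:
$$b = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2}$$
The formula for calculating the intercept regression parameter (a) is:
$$a = \bar{Y} - b\,\bar{X}$$
where $\bar{X}$ and $\bar{Y}$ are the means of the observed $X_i$ and $Y_i$ values and $n$ is the number of observations.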
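The least squares estimates of the slope b and the intercept a can be computed directly from paired observations. The sketch below uses the standard deviation-from-the-mean form of the estimator; the function name and the sample data are assumptions chosen so that the result is easy to check by hand.

```python
def least_squares(xs, ys):
    """Estimate intercept a and slope b of Y = a + b * X by least squares."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all X values are identical; slope is undefined")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx          # slope
    a = y_bar - b * x_bar  # intercept
    return a, b

# Illustrative data lying exactly on the line Y = 1 + 2 * X.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
a, b = least_squares(xs, ys)
print(a, b)
```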
IV. LITERATURE SURVEY
This section presents a literature survey of chronic disease data analysis. Researchers have implemented different computing techniques for forecasting and visualizing unseen data. The survey covers the global expansion rate of chronic diseases as well as forecasting models and visualization techniques.
The WHO global report [3] shows that about 58 million people died in 2005 and that approximately 35 million of those deaths resulted from chronic diseases, which is about 60% of global deaths: "only 20% of chronic disease deaths occur in high-income countries, while 80% occur in low and middle-income countries, where most of the world’s population lives". Chronic diseases are expected to account for about 70% of total deaths worldwide by 2030.
Khatib O. [18] stated, "chronic diseases are the major disease burden in the Eastern Mediterranean Region. There are many risk factors associated with chronic diseases, most of them are related to the lifestyle and can be controlled. Such as low vegetable and fruit intake, physical inactivity, high fast food consumption and high cholesterol are dominant causes of cardiovascular disease and some types of cancer. Also obesity and overweight can raise the risk of chronic diseases, like heart disease and diabetes". Many approaches can help to deal with this problem, such as developing national strategies, policies, and plans for prevention and health care, and implementing and enhancing community participation in prevention and health care. Most of the solutions for preventing chronic diseases are expensive [18].
Gregory Hartl and Menno van Hilten (2012) [1] explain that non-communicable diseases (NCDs) cause more than 60% of all deaths in the Gulf Cooperation Council (GCC) countries. The risk factors are an unhealthy diet, tobacco use, and physical inactivity. The GCC countries (Saudi Arabia, Bahrain, Sultanate of Oman, Kuwait, Qatar, and United Arab Emirates) are adopting a regional strategy to address, prevent, and control NCDs such as diabetes, cancer, and chronic respiratory disease. The strategy aims to reduce people's exposure to the different risk factors and to improve the services for preventing and treating these health problems [1].
Hill A.G., et al. (2000) mentioned that awareness of chronic diseases has grown in Oman. Morbidity for diagnoses related to chronic diseases, especially cancer, cardiovascular disease, and endocrine disease, was growing and becoming a significant share of Oman's burden of disease [19].
Jawad A. Al-Lawati et al. (2008) state that chronic diseases pose the main challenge for the Omani population [2]. According to Ministry of Health statistics in Oman for the period 1990 to 2005, approximately 75% of the disease burden is attributable to chronic diseases. "The distribution of chronic diseases and related risk factors among the general population is similar to that of industrialized nations: 12% of the population has diabetes, 30% is overweight, 20% is obese, 41% has high cholesterol, and 21% has the metabolic syndrome". They conclude that chronic diseases are the major drain on human and financial resources for the Sultanate of Oman, and that this will affect the advances that have been achieved in the health care system.
Similarly, some related works focus on visualization techniques such as the interactive map. Jason Dykes (1998) implemented a cartographic visualization that locates symbols on a plane to show the statistical distributions of one or more variables [12].
V. RESULTS AND DISCUSSIONS
According to statistical data collected from the Ministry of Health in the Sultanate of Oman, the prevalence of diabetes in Oman is increasing over the years, as illustrated in Figure 8.
Figure 8: Diabetes in Oman (source MoH Oman)
After applying simple linear regression, the fitted equation is Y = 4319.3 + 127.9 * X, and the R-squared value is 0.46; this value describes how well the data fit the regression line. The forecasting results are shown in Figure 9.
Figure 9: Forecasting results for diabetes in Oman after 5 years
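The fitted equation reported above can be applied as in the following sketch, which also shows how an R-squared value is computed from observed and fitted values. The coefficients are the ones stated in the text; treating X as a time index (for example, a year number) is an assumption, since the paper does not define X, and the underlying observations are not given, so no attempt is made here to reproduce the reported 0.46.

```python
INTERCEPT = 4319.3  # a, as reported in the text
SLOPE = 127.9       # b, as reported in the text

def forecast(x):
    """Predicted value Y for time index x (meaning of x is assumed)."""
    return INTERCEPT + SLOPE * x

def r_squared(observed, fitted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = sum(observed) / len(observed)
    ss_res = sum((y - f) ** 2 for y, f in zip(observed, fitted))
    ss_tot = sum((y - y_bar) ** 2 for y in observed)
    return 1 - ss_res / ss_tot

# Forecast five steps ahead of a hypothetical last observed index of 10.
last_index = 10
future = [forecast(last_index + k) for k in range(1, 6)]
print([round(v, 1) for v in future])
```

An R-squared of 1 means the fitted line reproduces the observations exactly, while 0 means it explains no more variation than the mean does.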
The results of this forecasting model show that the prevalence of diabetes in the Sultanate of Oman will increase. This will help decision makers pay attention to the spread of chronic diseases.
This paper implements the simple linear regression method for statistical data analysis and forecasting because the collected data consist of one variable that changes over time. In addition, it selects the cartographic visualization method to show the distribution of the data across different geographic locations. The method uses symbols to represent the data, with each symbol scaled to the value of the variable.
CONCLUSION & FUTURE RESEARCH DIRECTION
This paper provided an overview of chronic diseases in the GCC. First, it presented the risk factors that cause chronic diseases. It then explained different methods for statistical data analysis and forecasting, and presented and reviewed different data visualization techniques. The cartographic technique is appropriate for visualizing chronic disease data because it illustrates the distribution of the data across different geographic locations. The literature survey indicated that simple linear regression is an appropriate method for analyzing and forecasting data with only one variable.
REFERENCES
[1] Gregory Hartl, Menno van Hilten (January 2012). “WHO recognizes
progress of Gulf States for adopting regional strategy to address non-communicable diseases", [Online], available at:
http://www.who.int/mediacentre/news/statements/2012/ncds_20120106/en/
(accessed 15 September 2016)
[2] Al-Lawati JA, Mabry R, Mohammed AJ (July 2008), “Addressing the
Threat of Chronic Diseases in Oman", CDC, VOLUME 5: NO. 3.
[3] World Health Organization (2005), WHO global report: "Preventing
chronic diseases: A vital investment". Geneva: ISBN 92 4 156300 1.
[4] “What are infectious diseases?", [Online], available at: http://www.yourgenome.org/facts/what-are-infectious-diseases (accessed 29
July 2016)
[5] “Directorate General for Disease Surveillance and Control in Ministry of
Health - Sultanate of Oman”, [Online], available at:
https://www.moh.gov.om/en/web/directorate-general-of-disease-surveillance-control/ (accessed 02 June 2016)
[6] “Mapping Disease", [Online], available at: http://www.the-scientist.com/?articles.view/articleNo/35349/title/Mapping-Disease/ (accessed
23 July 2016)
[7] "Time Series", [Online], available at: http://www.statslab.cam.ac.uk/~rrw1/timeseries/t.pdf (accessed 03 October
2016)
[8] "Trend Analysis", [Online], available at:
http://pubs.usgs.gov/twri/twri4a3/pdf/chapter12.pdf (accessed 15 June 2016)
[9] “Time trends analysis", [Online], available at: https://ec.europa.eu/jrc/sites/default/files/epaac-wp9-session2-crocetti.pdf
(accessed 02 July 2016)
[10] Jabar H. Yousif, Classification of Mental Disorders Figures based on Soft
Computing Methods. International Journal of Computer Applications
117(2):5-11, May 2015.
[11] Jabar H. Yousif and Mabruk A. Fekihal. Neural Approach for
Determining Mental Health Problems. Journal of Computing, Vol.4, Issue 1,
pp6-11 ISSN 2151-9617 ,NY, USA, January 2012.
[12] Jason Dykes (1998), "Cartographic visualization: exploratory spatial data
analysis with local indicators of spatial association using Tcl/Tk and cdv", UK, The Statistician, 47, Part 3, pp. 485-497.
[13] Edwin de Jonge (2012), "Data Visualization", Netherlands, Statistics
Netherlands, ISSN: 1876-0333.
[14] Heinrich Hartmann (2016), "Statistics for Engineers", USA, ACM Queue
10.1145/2890780.
[15] Ben Fry (2008), "Visualizing Data", USA, O’Reilly, ISBN-10: 0-596-
51455-7.
[16] Adriana REVEIU, Marian DARDALA (2011), "Techniques for
Statistical Data Visualization in GIS", Romania, Informatica Economica, vol.
15, no. 3/2011
[17] Ralph D'Agostino, Lisa M. Sullivan (January 2002), "Chronic Disease
Data And Analysis: Current State Of the Field", Boston (US). Digital Commons. Vol. 1: No 2, 228-239, Article 32.
[18] Khatib O. (2004), "Non-communicable diseases: risk factors and regional
strategies for prevention and care", East Mediterr Health J 2004; 10(6):778-88.
[19] Hill AG, Muyeed AZ, Al-Lawati JA. (2000), "The mortality and health
transition in Oman: patterns and processes". Muscat (OM): World Health
Organization, Regional Office for the Eastern Mediterranean, UNICEF Oman.
[20] “Definition of Chronic disease", [Online], available at: http://www.medicinenet.com/script/main/art.asp?articlekey=33490 (accessed
07 June 2016)
[21] "Histogram", [Online], available at: http://searchsoftwarequality.techtarget.com/definition/histogram (accessed 03
October 2016)
[22] Astrid Schneider, Gerhard Hommel, and Maria Blettner (2010), "Linear
Regression Analysis", Germany; Medicine. 107(44): 776–82.
[23] Christoph Klose, Marion Pircher, Stephan Sharma (May 2004), "Univariate Time Series Forecasting", UK.
[24] Choong-Yeun Liong and Sin-Fan Foo (2013), "Comparison of Linear Discriminant Analysis and Logistic Regression for Data Classification",
Malaysia, LLC 978-0-7354-1150-0.
[25] Suzilah Ismaila, Rohaiza Zakariaa and Tuan Zalizam Tuan Mudab
(2014), "Univariate Time Series Forecasting Algorithm Validation",
Malaysia, LLC 978-0-7354-1274-3.
[26] Adela SASU (2013), "A Quantitative Comparison of Models for
Univariate Time Series Forecasting", Romania, University of Brasov, 117-
124.
[27] Kelly H. Zou, Kemal Tuncali, Stuart G. Silverman (2003), "Correlation
and Simple Linear Regression", USA, Radiology; 227:617–628.
[28] Mohammad S. Alam (2013), "Analytical Review of Data Visualization
Methods in Application to Big Data", Russia, Journal of Electrical and
Computer Engineering, Article ID 969458, 7 pages.
[29] Peter Filzmoser, Karel Hron, Clemens Reimann (2009), "Univariate
statistical analysis of environmental (compositional) data: Problems and possibilities", Austria, ScienceDirect, STOTEN-11466.
| S | Author | Year | Place | Method of study | Major Findings | Merits | Limitations |
|---|---|---|---|---|---|---|---|
| 1 | Hill A.G., Muyeed A.Z., Al-Lawati J.A. [19] | 2000 | Oman | Numerical and Analytical | Diabetes has become a major chronic (or non-communicable) disease problem in Oman. The prevalence of diabetes varies among the governorates of Oman; the percentage for diabetes in 2000 is 13%. | Awareness of chronic disease has grown in Oman. | The percentage for diabetes in Oman was the highest reported in the Arab countries. |
| 2 | Khatib O. [18] | 2004 | Oman | Analytical | The incidence of chronic diseases is rising in the Middle East region. In 2000, 47% of the region's disease burden was due to non-communicable diseases, and this is expected to rise to 60% by 2020. | There are many strategies: developing national strategies, policies, and plans for prevention and care, and implementing and enhancing community participation in prevention and care. | The risk factors associated with chronic diseases are physical inactivity, unhealthy diet, and smoking. |
| 3 | World Health Organization [3] | 2005 | Geneva | Analytical and Numerical | About 58 million people died in 2005, and approximately 35 million of those deaths resulted from chronic diseases, about 60% of global deaths. If current trends continue, chronic diseases will account for about 70% of total deaths globally by 2030. | Many governments around the world have spent billions of dollars on medication to control and prevent the spread of chronic diseases. | Chronic diseases are a threat that is growing over time and will affect all countries. |
| 4 | Jawad A. Al-Lawati, Ruth Mabry and Ali Jaffer Mohammed [2] | 2008 | Oman | Analytical | Chronic diseases pose the main challenge for the Omani population. During the period 1990 to 2005, approximately 75% of the disease burden is attributable to chronic diseases. | Health planners and decision makers in Oman have a greater commitment to the provision of services for people with chronic diseases. | Chronic diseases are the major drain on human and financial resources for Oman, and this will affect the advances that have been achieved in the health care system. |
| 5 | Gregory Hartl, Menno van Hilten [1] | 2012 | Geneva | Analytical | Chronic diseases cause more than 60% of all deaths in the Gulf Cooperation Council (GCC) countries. All the GCC countries are adopting a regional strategy to address, prevent, and control chronic diseases. | The regional strategy in all the GCC countries aims to reduce people's exposure to risk factors and to improve the services that prevent and treat these health problems. | There are many risk factors related to chronic diseases, such as unhealthy diet, tobacco use, and physical inactivity. |
| 6 | Ralph D'Agostino, Lisa M. Sullivan [17] | 2002 | Boston (US) | Logistic Regression Analysis and Survival Analysis | The model for chronic disease consists of four stages or phases: disease free, pre-clinical (latent period), clinical manifestation, and follow-up. Good statistical approaches involve hypothesizing models for these stages, collecting appropriate data, and then fitting and testing the appropriate models. The risk factors are age, gender, smoking status, blood pressure, and cholesterol. | Logistic regression analysis can examine the effects of risk factors on the development of disease. Survival analysis (time-to-event data) deals with the time until the event of interest occurs. | There are many methodological issues related to chronic disease analysis. The first is changing definitions of risk factors and outcomes over time. The second is missing data. |
| 7 | Kelly H. Zou, Kemal Tuncali, Stuart G. Silverman [27] | 2003 | USA | Simple Linear Regression | Simple linear regression measures the linear relationship between a predictor variable and an outcome variable. | The values of dependent variables can be estimated from the observed values; it summarises the real observations of the model. | Missing values are a common problem. |
| 8 | Christoph Klose, Marion Pircher, Stephan Sharma [23] | 2004 | UK | Autoregressive moving average (ARMA) linear model | Forecasting of time series can be done with different linear models, including autoregressive (AR), moving average (MA), and ARMA models. Building an ARMA model has three phases: model identification, estimation (or fitting), and validation. | The ARMA model process is stationary; it can also minimize the number of parameters. | The ARMA model is suitable only for time series that are stationary (for example, the mean and variance should be constant over time); at least 40 observations are required in the input data; the values of the estimated parameters are assumed to be constant during the series. |
| 9 | Peter Filzmoser, Karel Hron, Clemens Reimann [29] | 2009 | Austria | Multivariate data analysis | The process of statistical data analysis starts with looking at the data with appropriate graphical tools. For instance, a histogram gives an idea of the data distribution. | It can estimate and show the relationship between different variables. | The method is complex and involves a high level of mathematical calculation; the results are sometimes not easy to interpret; it needs a large amount of data. |
| 10 | Astrid Schneider, Gerhard Hommel, and Maria Blettner [22] | 2010 | Germany | Linear Regression Analysis | There are three types of regression analysis: linear regression, logistic regression, and Cox regression. Linear regression is the most common approach or tool used for modelling univariate time series, and it is used for statistical and predictive analysis. | It describes the relationships between dependent and independent variables; the values of dependent variables can be estimated from the observed values; the risk factors that affect the outcome can be identified; it summarises the real observations of the model. | The most common problem in medical data analysis is missing values. |
| 11 | Choong-Yeun Liong and Sin-Fan Foo [24] | 2013 | Malaysia | Logistic regression (LR) analysis and linear discriminant analysis (LDA) | Two methods are used for classifying data: logistic regression (LR) and linear discriminant analysis (LDA). Both depend on a set of predictor variables. LR is more robust than LDA. | The LR method performs better with respect to the distribution of the data; LDA needs a short computing time with a large sample size. | LR needs a long computing time with a large sample size; LDA gives low performance when the prior probability is equal for all groups. |
| 12 | Adela SASU [26] | 2013 | Romania | ARIMA; Linear Regression (LR); Multilayer perceptrons network (MLP) | The purpose of the forecasting process is to observe or model the existing data series in order to predict future unknown data values accurately, for example after five years. | Linear regression (LR) and the multilayer perceptrons network (MLP) give more accurate predictions. | The ARMA model is suitable only for time series that are stationary; ARIMA gives less accurate prediction results. |
| 13 | Suzilah Ismaila, Rohaiza Zakariaa and Tuan Zalizam Tuan Mudab [25] | 2014 | Malaysia | Quantitative (Projective) forecasting technique | The process of forecasting is not simple, because it requires expert and implicit knowledge to produce precise forecast values. For statistical data, the quantitative forecasting technique is suitable. | The technique gives valuable judgment; it also gives more weight to recent data. | The technique requires a large amount of data in order to formulate the mathematical model. |
| 14 | Jason Dykes [12] | 1998 | UK | Cartographic visualization | Cartographic visualization is a map-based method. It locates symbols on a plane to show the statistical distributions of one or more variables. Local statistics and indicators can also be calculated and visualized in a dynamic fashion for exploration. | Cartographic visualization uses the locations inside the map to show data values; it is interactive. | It is very hard to use for automatically created dynamic maps. |
| 15 | Ben Fry [15] | 2008 | USA | Mapping | The visual can communicate or announce more information than the table within a small space. | It allows comparison between different areas; it presents the geographic location and the distribution of the data. | The size of a symbol may hide the location on the map; it takes more time to execute, depending on the size of the data. |
| 16 | Adriana REVEIU, Marian DARDALA [16] | 2011 | Romania | Cartographic visualization | Cartographic visualization provides the facilities to represent statistical data inside the map. | It shows the distribution of statistical data inside the regional map; it uses different symbols to show more information at the same time inside the map. | It takes more time to execute as the volume of data increases. |
| 17 | Statistics Netherlands [13] | 2012 | Netherlands | Proportional symbol map | The proportional symbol map is a good way of viewing statistical data. It aims to show the characteristics of a subject on the map. | It presents the geographic location and the distribution of the data; it allows comparison between different areas; it summarises large amounts of data. | It is difficult to calculate the actual value (if not shown); it takes more time to execute; the size of a symbol may hide the location on the map. |
| 18 | Mohammad S. Alam [28] | 2013 | Russia | Tree-map; Circle Packing; Sunburst; Circular Network Diagram | One of the main requirements that analyst tools should meet is to show more than one view per representation display. The Circular Network Diagram is the most appropriate method for visualizing statistical data. | Tree-map: hierarchical grouping clearly shows data relations. Circle Packing: space-efficient. Sunburst: easily perceptible by most humans. Circular Network Diagram: allows relative data representation; inside the circle, the resolution varies linearly, increasing with radial position. | Tree-map: not suitable for examining historical trends and time patterns. Circle Packing: same as for the Tree-map method. Sunburst: same as for the Tree-map method. Circular Network Diagram: objects with the smallest parameter weight can be suppressed by larger ones. |
| 19 | Heinrich Hartmann [14] | 2016 | USA | Histograms; Line plots; Rug plots; Scatter plot | One of the most essential data analysis methods is visualization. The histogram is a popular visualization method. It is commonly used to show the distribution of numerical data. | The histogram shows the number of values within an interval; the line plot shows changes in data over time; the scatter plot shows the relationship between two variables. | The histogram is used only for numerical data; the line plot is difficult to read accurately if there is a wide range of data; the scatter plot cannot show the relation of more than two variables. |