AN EFFECTIVE NOVEL INTEGRATING BIG DATA
ANALYTICS AND CLOUD FRAMEWORK FOR E-
GOVERNANCE PARADIGM WITH OPENSTACK SAHARA
ENVIRONMENT
1SK PUJITHA, G NAGA PAVANI2, P PAVAN KUMAR3, R VEERA MOHANA RAO4, KUMARARAJA
JETTI5
123ASST.PROF, VIGNAN'S LARA INSTITUTE OF TECHNOLOGY & SCIENCE, VADLAMUDI, ANDHRA PRADESH- 522213,
INDIA. 45ASST.PROF, BAPATLA ENGINEERING COLLEGE, BAPATLA, ANDHRA PRADESH 522102, INDIA
Abstract:
Information and communications technology (ICT) is an extended term for information technology (IT) that
stresses the role of unified communications[1] and the integration of telecommunications (telephone lines and
wireless signals) and computers, as well as the necessary enterprise software, middleware, storage, and
audiovisual systems, that enable users to access, store, transmit, and manipulate information. Electronic
governance, or e-governance, is the application of ICT for delivering government services, exchanging
information, carrying out communication transactions, and integrating different stand-alone systems. Cloud
computing is a recent advancement over traditional computing architectures that can overcome the challenges
related to computing, while big data analytics has emerged from earlier data management technologies to
address the constraints related to data acquisition, storage, processing, and analytics. Therefore, the
combination of cloud computing and big data analytics would be a welcome choice for e-governance. The Sahara
project aims to provide users with a simple means to provision data processing frameworks (for example,
Apache Hadoop, Apache Spark, and Apache Storm) on OpenStack. This is accomplished by specifying configuration
parameters such as the framework version, cluster topology, node hardware details, and more.
OpenStack is an open-source cloud platform widely used by many organizations because of its
distinguishing features over existing computing architectures and its built-in support for big data
analytics. The main objective of this paper is to propose a framework for e-governance based on
OpenStack that integrates cloud computing and big data analytics to overcome the challenges in the
existing e-governance system.
Keywords: Information And Communications Technology, E-Governance Paradigm, Big Data Analytics,
Openstack Sahara, Cloud Computing.
I. INTRODUCTION
Electronic governance, or e-governance, is the application of information and communication technology
(ICT) for delivering government services, exchanging information, carrying out communication transactions, and
integrating different stand-alone systems between government and citizens (G2C), government and businesses (G2B),
government and other governments (G2G), and government and employees (G2E), as well as back-office processes and
interactions within the entire government framework.[1] Through e-governance, government services are made
available to citizens in a convenient, efficient, and transparent manner. The three main target groups that
can be distinguished in governance concepts are government, citizens, and businesses/interest groups. In
e-governance, there are no distinct boundaries.
Advances in the Internet and related technologies have encouraged many businesses and governments to
upgrade the technology of their applications running on the Web. The current e-governance system has not been
Azania Journal ISSN NO : 1945-5534
Volume 21, Issue 2 February - 2021 Page No : 9
expanded sufficiently because of limited storage and infrastructure, which affects its usage. Therefore,
modern smart technologies such as cloud computing and big data analytics can be integrated to overcome
the existing challenges of e-governance at very low cost.
In this research paper, a framework based on OpenStack is proposed for integrating cloud computing and big
data analytics for e-governance. The first part of this paper explains e-governance, cloud
computing, and big data analytics, followed by OpenStack. The latter part explains the proposed framework.
II. RELATED WORK
2.1 Electronic governance
Electronic governance, or e-governance, is the application of information and communication technology (ICT)
for delivering government services, exchanging information, carrying out communication transactions, and
integrating different stand-alone systems between government and citizens (G2C), government and businesses (G2B),
government and other governments (G2G), and government and employees (G2E), as well as back-office processes and
interactions within the entire government framework.[1] Through e-governance, government services are made
available to citizens in a convenient, efficient, and transparent manner. The three main target groups that can
be distinguished in governance concepts are government, citizens, and businesses/interest groups. In
e-governance, there are no distinct boundaries.
E-Government can be seen simply as moving citizen services online; however, in its broadest sense, it refers
to the technology-enabled transformation of government, which offers the best hope of reducing costs while
promoting economic development, increasing transparency in government, improving service delivery and public
administration, and facilitating the advancement of an information society.[1][2][3] The principal objectives are:
2.1.1 Reducing Costs
Putting services online considerably decreases the processing costs of many activities compared with
the manual way of handling operations. Efficiency is also attained by streamlining internal
processes and by enabling faster and more informed decision making.
2.1.2 Promoting Economic Development
Technology enables governments to create positive business climates by simplifying relationships with
businesses and reducing the administrative steps needed to comply with regulatory obligations. There is a direct
effect on the economy, as in the case of e-procurement, which creates wider competition and more participants in
the public sector marketplace.
2.1.3 Enhancing Transparency and Accountability
E-Government helps to increase the transparency of decision-making processes by making information
accessible: publishing government debates and minutes, budgets and expenditure statements, outcomes and
rationales for key decisions, and, in some cases, allowing the online tracking of applications by
the public and the press.
2.1.4 Improving Service Delivery
Government service delivery, in the traditional process, is time-consuming, lacks transparency, and leads to
citizen and business dissatisfaction. By putting government services online, e-Government reduces bureaucracy
and enhances the quality of services in terms of time, content, and accessibility.
2.1.5 Improving Public Administration
E-government administrative components, such as a computerized treasury, integrated financial
management information systems, and human resource management systems, lead to greater efficiency in
public administration. Features include the integration of expenditure and receipt data, control of expenditure,
human resources management, intelligent auditing through data analysis, and the publishing of financial data.
2.1.6 Facilitating an e-Society
One of the main benefits of an e-governance initiative is the promotion of ICT use in other
sectors. The technological and management capacities required for e-government administration encourage,
in turn, the development of new training courses and modules in schools and universities that aim to supply
the required skills and capabilities to the market.
2.2 Cloud Computing
Cloud computing has been defined as "a technology that relies on the transfer of processing and computer
storage space to the so-called cloud, a data center accessed over the network, thereby changing information
technology programs from products to services, which contributes to the removal of maintenance and
systems development problems from the companies that use them, so that beneficiaries can focus their efforts
on the use of these services only" [10].
Cloud-based integration is a form of systems integration delivered as a cloud computing service that
addresses data, process, service-oriented architecture (SOA), and application integration.[1]
Cloud integration was essentially created to break down data silos, improve connectivity, and
optimize business processes. Cloud integration has grown in popularity as the usage of Software as a Service
solutions increases day by day.
Cloud computing can also be seen as a generic concept that includes software as a service and other modern
technology trends that share the idea of relying on the Internet to meet the computing needs of users [8].
One marketing buzzword is iPaaS (integration platform as a service), a suite of cloud services enabling
customers to develop, execute, and govern integration flows between disparate applications.[2] Under the
cloud-based iPaaS integration model, customers drive the development and deployment of integrations without
installing or managing any hardware or middleware.[1] iPaaS enables businesses to achieve integration
without a large investment in skills or licensed middleware software. iPaaS used to be regarded as an
integration tool for cloud-based software applications, used mainly by small to mid-sized
businesses. Meanwhile, however, a hybrid type of iPaaS,[3] i.e. Hybrid-IT iPaaS, which connects the cloud
to on-premises systems, is becoming increasingly common. Moreover, large enterprises are exploring ways of
integrating iPaaS into their existing IT estate.
The challenges of cloud computing are summarized as follows: performance, security and privacy, control,
data transfer costs, and accuracy and reliability [11, 6].
2.3 Working of Big Data Analytics
In big data systems, data is broken into multiple data sets and stored on different nodes of a cluster. One of
the nodes keeps track of the data stored on the other nodes by maintaining an index of the data. When a search
request arrives, the primary index is checked and, based on it, the required data is located and fetched. This
makes the search process faster even when the data is very large and unwieldy. For example, Twitter
and Facebook use big data concepts. Data from any database management system, such as Oracle or
MySQL, can be imported into a big data platform.
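The index-based lookup described above can be illustrated with a minimal Python sketch. It is not taken from any particular big data product: the class name `IndexedCluster`, the hash-partitioning policy, and the sample record are all assumptions made for illustration.

```python
import zlib


class IndexedCluster:
    """Toy cluster: records are spread over nodes and one primary index tracks where each lives."""

    def __init__(self, node_count):
        # Each "node" is modelled as an in-memory dict of key -> record.
        self.nodes = [dict() for _ in range(node_count)]
        # Primary index kept by the coordinating node: key -> node id.
        self.primary_index = {}

    def put(self, key, record):
        # Assumed placement policy: a stable hash of the key selects the node.
        node_id = zlib.crc32(key.encode("utf-8")) % len(self.nodes)
        self.nodes[node_id][key] = record
        self.primary_index[key] = node_id

    def locate(self, key):
        # Consult only the index; returns None for unknown keys.
        return self.primary_index.get(key)

    def get(self, key):
        # Check the primary index first, then fetch from the single node that holds the data.
        node_id = self.locate(key)
        if node_id is None:
            return None
        return self.nodes[node_id][key]


cluster = IndexedCluster(node_count=3)
cluster.put("citizen-001", {"name": "A"})
record = cluster.get("citizen-001")
```

Because the index answers "which node?" directly, a lookup touches one node instead of scanning all of them, which is the speed-up the paragraph refers to.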
Through data mining activities, data can be collected from different servers. Data analytics techniques are then
applied to the retrieved data and reports are generated. These reports are then analyzed to determine the
behavior observed under a specific condition, and future action plans are prepared.
Fig. 2 Conceptual Model of Big Data Analytics
Fig. 2 explains the conceptual architecture of big data analytics. Here, data residing in different
formats, from different sources and different locations, is transformed using various big data analytics
tools and platforms. Different decisions are made regarding the data approach, distributed design, tool
selection, and analytics models. Finally, four applications of data analytics, i.e. queries, reports, OLAP, and
data mining, are produced.
2.4 E-Governance Initiatives in India
The vision of e-Governance [2] is to "Make every public service accessible to the common man in his locality,
through common service delivery outlets, and ensure efficient, transparent, and reliable information at minimal
cost." The primary intent of e-Governance is to satisfy five major objectives [3]: a policy
framework, enhanced public services, high-quality and cost-effective government operations, citizen engagement in
democratic processes, and administrative and institutional reform. In India, the National e-Governance Plan
(NeGP) [10] has been formulated by the Department of Electronics and Information Technology (DeitY) and the
Department of Administrative Reforms and Public Grievances (DARPG). According to NeGP, over 1000
e-Governance services have been accessed through it. e-Taal (Electronic Transaction Aggregation and
Analysis Layer) is the government web portal that provides statistics on transactions carried out electronically
by citizens with e-Governance projects. According to e-Taal, Indians carried out over 2 billion e-transactions in
the most recent year [1].
2.4.1 Various E-Governance projects in India include:
1) Land Record Management Projects: These maintain a large number of land records and help provide citizens
with reliable and useful related information in the shortest time, e.g. Maa Bhoomi (Telangana),
Mee Bhoomi (Andhra Pradesh), and the land records management system of the state government of Devbhoomi
(Uttarakhand).
2) Local Information Projects: Citizens can search for local information, for example, credit
rates and prices of seeds, fertilizers, etc., using e-Governance services such as E-JanSampark
(Chandigarh), Prajavani (Andhra Pradesh), and E-Samadhan (Himachal Pradesh).
3) Agriculture: The following projects are used in agriculture: GYANDOOT (Madhya Pradesh),
AGMARKNET and SEEDNET (Department of Marketing and Inspection, Ministry of Agriculture,
Government of India).
4) Disaster Management: Managing disasters is a very challenging activity for the government, as they are
natural and unpredictable. To handle such disasters, state governments have started e-governance services such as
Project Chetana, started by the state government of Bihar to deal with natural disasters like floods and
earthquakes.
5) Services provided by such e-Governance projects include the issue of birth/death certificates, rural water
supply and sanitation, and the administration of various welfare schemes for the poor and needy.
6) The biggest application of e-Governance in India is UID Aadhaar. It is one of the prestigious projects of
the Indian government, in which a biometric card with a Unique Identification Number is issued to every citizen.
UID Aadhaar is a unique identification project undertaken by the Unique Identification Authority of India
(UIDAI), established in 2009. The authority maintains a database of residents containing biometric and other
data. Each of these numbers is stored in a centralized database and linked to basic demographic
and biometric information, for example, a photograph, ten fingerprints, and the iris scans of each person. In
India, this is the biggest source of big data. As of 15 October 2019, over 100.5 crore Aadhaar numbers have been
issued in the project [6].
2.4.2 Benefits of utilizing Big Data Analytics for E-Governance Projects
The significant benefits of introducing a big data platform in e-Governance projects are to facilitate
government. These benefits are:
• Improvement in online information and service delivery by the government for Business Analytics.
• Making processes open and transparent to users.
• Recording and Visualizing government performance through predictive policing.
• Producing an understanding of new business projects and improving business development.
• Providing better Customer Services.
2.4.3 Challenges of Using Big Data Analytics with E-Governance
Although analyzing big data using different analytical techniques has proved to be useful for
e-Governance projects, there are still some challenges that need to be handled to achieve desirable
results. Some of them are:
• Privacy: Big data analytics refers to analyzing historical data. Each e-Governance project comprises
a great deal of private information related to different users. This data can concern the UIDs of people,
the sales/purchase information of a business firm, information about the clients of an organization, or the
records of patients of a hospital. Such organizations may resist disclosing their private information. If
provided, such information ought to be preserved properly so that it remains safe and beneficial.
• Security: Big data generated from e-Governance projects is a mix of large data sets stored on
different servers. Storage and security of this big data are among the significant challenges. Processing this
data using data analytics produces further data in the form of reports. Handling this data and protecting it
from unauthorized access and usage must be done carefully.
• Shortage of qualified and technical people to handle and implement such techniques: It is still very
hard to find skilled and qualified people who are proficient in using big data analytics techniques with
e-Governance projects.
• Unavailability and under-development of relevant software tools, and integration of multiple data sources
and formats: Data generated by different e-Governance projects during analysis is of different
types and in different formats. Sometimes the use of relevant software tools is not possible because they are
unavailable or because the technique/software is still under development.
• Lack of common standards for information/data representation and interoperability: No proper standards are
defined for how data is to be represented using analytical techniques.
2.5 OpenStack SAHARA
The Sahara project aims to provide users with a simple means to provision data processing frameworks (for
example, Apache Hadoop, Apache Spark, and Apache Storm) on OpenStack. This is accomplished by
specifying configuration parameters such as the framework version, cluster topology, node hardware details, and
more.
Apache Hadoop is an industry-standard and widely adopted MapReduce implementation; it is one of a
growing number of data processing frameworks. The aim of this project is to enable users to easily
provision and manage clusters with Hadoop and other data processing frameworks on OpenStack. It is worth
mentioning that Amazon has provided Hadoop for several years as the Amazon Elastic MapReduce (EMR) service.
Sahara aims to provide users with a simple means to provision Hadoop, Spark, and Storm clusters by
specifying several parameters such as the framework version, cluster topology, hardware node details, and
more. After a user fills in all of the parameters, Sahara deploys the cluster in a few minutes. In addition,
Sahara provides a means to scale an already provisioned cluster by adding or removing worker nodes on
demand.
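To make the parameters listed above concrete, the following Python sketch models a cluster description and the scaling of its worker group. It is only an illustration and does not use or reproduce Sahara's actual API: the names `NodeGroup`, `ClusterSpec`, and `scale_workers`, as well as the flavor names and the version string, are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Dict


@dataclass(frozen=True)
class NodeGroup:
    flavor: str  # node hardware details (hypothetical flavor name)
    count: int   # number of nodes in this group


@dataclass(frozen=True)
class ClusterSpec:
    framework: str                     # e.g. "hadoop", "spark", or "storm"
    version: str                       # framework version (illustrative value)
    node_groups: Dict[str, NodeGroup]  # cluster topology: group name -> group


def scale_workers(spec, delta, group="worker"):
    """Return a new spec with `delta` nodes added to (or removed from) one group."""
    current = spec.node_groups[group]
    new_count = current.count + delta
    if new_count < 1:
        raise ValueError("a cluster needs at least one node in the worker group")
    groups = dict(spec.node_groups)
    groups[group] = replace(current, count=new_count)
    return replace(spec, node_groups=groups)


spec = ClusterSpec(
    framework="hadoop",
    version="2.7.1",  # assumed version, for illustration only
    node_groups={
        "master": NodeGroup(flavor="m1.large", count=1),
        "worker": NodeGroup(flavor="m1.medium", count=3),
    },
)
bigger = scale_workers(spec, 2)    # add two worker nodes on demand
smaller = scale_workers(spec, -1)  # remove one worker node
```

The specification is kept immutable so that each scaling request yields a new description, leaving the originally provisioned topology available for comparison.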
III. PROPOSED WORK
The proposed model/framework is deployed on OpenStack because of its unique advantages
over other cloud platforms.
The Sahara product communicates with the following OpenStack services:
• Dashboard (horizon) - provides a GUI with the ability to use all of Sahara's features.
• Identity (keystone) - authenticates users and provides security tokens that are used to work with
OpenStack, limiting a user's abilities in Sahara to their OpenStack privileges.
• Compute (nova) - used to provision VMs for data processing clusters.
• Bare metal (ironic) - used to provision bare metal nodes for data processing clusters.
• Orchestration (heat) - used to provision and orchestrate the deployment of data processing clusters.
• Image (glance) - stores VM images, each image containing an operating system and a pre-installed data
processing distribution or framework.
• Object Storage (swift) - can be used as storage for job binaries and data that will be processed or
created by framework jobs.
• Block Storage (cinder) - can be used to provision block storage for VM instances.
• Networking (neutron) - provides networking services to data processing clusters.
• DNS service (designate) - provides the ability to communicate with cluster instances and Hadoop
services by their hostnames.
• Telemetry (ceilometer) - used to collect measures of cluster usage for metering and monitoring purposes.
• Shared file systems (manila) - can be used for the storage of framework job binaries and data that will be processed or created by jobs.
• Key manager (barbican and castellan) - persists authentication data such as passwords and private keys
in secure storage.
The proposed model has four layers that describe the functionality of integrated cloud-enabled big data for
e-governance. Each layer in the model has a predefined function that provides services to its upper layer.
Fig. 1 Proposed model for E-governance using OpenStack
In the proposed model, the first layer is the infrastructure layer, which is responsible for providing the
compute, storage, network, and other hardware resources to the cloud layer for building the cloud resource pool.
The second layer is the OpenStack cloud layer, which creates and runs VM instances and provides storage and
network services over the OpenStack infrastructure. This layer is also responsible for running different
e-governance applications by deploying Web servers, application servers, mail servers, database servers, etc.,
as VM instances over the cloud infrastructure. The third layer is a big data analytics layer that is controlled by
the Sahara components of OpenStack, which allow the creation of distributed virtual clusters to run different
distributions such as Apache Hadoop, Hortonworks, Storm, and Cloudera.
Integration with Object Storage
The swift project provides the standard Object Storage service for OpenStack environments; it is analogous to
the Amazon S3 service. As a rule, it is deployed on bare metal machines. It is natural to expect data processing
on OpenStack to access data stored there. Sahara provides this option with a file system implementation for
swift (HADOOP-8545) and Change I6b1ba25b, which implements the ability to list endpoints for an object,
account, or container. This makes it possible to integrate swift with software that relies on data locality
information to avoid network overhead.
For more information on how to enable swift support, see Swift Integration.
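The benefit of data locality information can be sketched in a few lines of Python. The example assumes that the hosts storing an object are already known (for instance, from an endpoint listing) and simply prefers a worker on one of those hosts. The function `pick_worker` and the host and worker names are hypothetical and are not part of swift, Hadoop, or Sahara.

```python
def pick_worker(object_hosts, workers):
    """Choose a worker for a task that reads one stored object.

    object_hosts: hosts known to hold a copy of the object.
    workers: mapping of worker name -> host the worker runs on.
    Returns (worker_name, is_local).
    """
    if not workers:
        raise ValueError("no workers available")
    replica_hosts = set(object_hosts)
    # Prefer a worker that runs on a host already holding the data.
    for name in sorted(workers):
        if workers[name] in replica_hosts:
            return name, True
    # Otherwise fall back to any worker; the data must then cross the network.
    return min(workers), False


workers = {"w1": "host-a", "w2": "host-b"}
local_choice = pick_worker(["host-b", "host-c"], workers)
remote_choice = pick_worker(["host-z"], workers)
```

When a local worker exists, the task reads its input without a network transfer, which is the overhead that locality-aware software tries to avoid.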
Pluggable Deployment and Monitoring
In addition to the monitoring capabilities provided by vendor-specific Hadoop management tooling, Sahara
provides pluggable integration with external monitoring systems such as Nagios or Zabbix.
Both deployment and monitoring tools can be installed on standalone VMs, thus enabling a single
instance to manage and monitor several clusters at once.
Multiple VMs from the cloud layer are used to create virtual clusters for Hadoop that process the big data.
Finally, the e-governance layer is responsible for giving access to e-governance services to their different
stakeholders, such as citizens, businesses, and other government departments. The horizon component of the cloud
layer provides administrator access to the e-governance applications deployed on the OpenStack cloud platform.
In this way, the proposed model meets the challenges of e-governance by integrating cloud and big
data services over the OpenStack cloud platform. The cloud layer provides benefits such as scalability of
resources, disaster recovery, portability, high availability, faster performance, cost efficiency, and automated
backup and recovery, while the big data layer provides advantages such as scalability of databases, storage of
multiformat data, flexibility, fault tolerance, faster processing, advanced data analytics, and
cost efficiency. The proposed framework can be applied to all applications of e-governance where large
computing and data processing power is required. The best example of such an application is the Aadhaar
identification service of the Indian Government, which collects and processes a huge amount of citizens'
identity data, including images, fingerprints, and textual data.
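The layering of the proposed model, in which each layer offers services to the layer above it, can be sketched as a simple chain of objects. This is an illustrative Python model only: the class `Layer`, the layer names, and the service names are assumptions chosen to mirror the description and do not correspond to OpenStack code.

```python
class Layer:
    """One layer of the model; `lower` is the layer whose services it builds on."""

    def __init__(self, name, services, lower=None):
        self.name = name
        self.services = set(services)
        self.lower = lower

    def resolve(self, service):
        """Return the name of the nearest layer, at or below this one, offering the service."""
        layer = self
        while layer is not None:
            if service in layer.services:
                return layer.name
            layer = layer.lower
        raise LookupError(f"no layer at or below {self.name!r} offers {service!r}")


# Four layers, bottom to top, following the description of the proposed model.
infrastructure = Layer("infrastructure", {"compute", "storage", "network"})
cloud = Layer("openstack-cloud", {"vm-instance", "web-server"}, lower=infrastructure)
big_data = Layer("big-data-analytics", {"hadoop-cluster"}, lower=cloud)
e_governance = Layer("e-governance", {"citizen-portal"}, lower=big_data)

provider_of_cluster = e_governance.resolve("hadoop-cluster")
provider_of_storage = e_governance.resolve("storage")
```

A request entering at the top layer is served by the closest layer beneath it that offers the service, while a lower layer never depends on a layer above it.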
Conclusion
This research paper has carried out a relevant study of cloud computing and big data analytics for
e-governance. It has contributed by proposing a new model that demonstrates a practical approach toward
cloud and big data-enabled e-governance using the OpenStack cloud platform. Big data analytics is changing the
way in which governments use sophisticated information technologies to gain insight
from their data repositories and make informed decisions. This data-driven approach is distinctive, as the
data collected through the web and social media is growing continuously. In the future, we will see the
rapid, widespread implementation and use of big data analytics across government and business. As it
becomes more mainstream, issues such as guaranteeing privacy, safeguarding security, establishing
standards and governance, and continually improving the tools and technologies will gain attention. The
proposed model retains the benefits of cloud computing, such as scalability, elasticity, disaster recovery, and
cost efficiency, while the benefits of integrated big data analytics include handling scalable volumes,
supporting the high velocity of incoming data, storing a variety of data in different formats, easier
data processing, and faster data analytics.
References
1. Jadhav B., Patankar A.B. (2019) A Framework for Integrating Cloud Computing
and Big Data Analytics into E-Governance Using Openstack Sahara. In:
Satapathy S., Joshi A. (eds) Information and Communication Technology for
Intelligent Systems. Smart Innovation, Systems and Technologies, vol 107.
Springer, Singapore
2. C. S. A. E. Abdullah Alhammadi, "The Determinants of Cloud Computing Adoption in Saudi Arabia,"
Computer Science & Information Technology (CS & IT), David C. Wyld et al. (Eds): CSEN,
AISO, NCWC, DOI: 10.5121/csit.2015.51406, pp. 55–67, 2015.
3. Ahmad Mosa, "Cloud Computing in E-Government: A Survey," International Journal of Advanced
Research in Computer Science & Technology (IJARCST), vol. 3, no. 2, 2015.
4. G. H. G.-E. Akram Ali Othman Nasr, "Proposed Development Model of E-Government to
Appropriate Cloud Computing," International Journal of Reviews in Computing, vol. 9, 2012.
5. Dash, S., Pani, S.K.: E-Governance paradigm using cloud infrastructure: benefits and challenges. In:
International Conference on Computational Modeling and Security, pp. 843–855. Elsevier (2016)
6. Smitha, K.K., Thomas, T., Chitaranjan, K.: Cloud-based E-Governance system: a survey. In:
International Conference on Modeling, Optimization, and Computing, pp. 3816–3823. Elsevier (2012).
7. Kurikala, Geetha, K. Gurnadha Gupta, and A. Swapna. "Fog computing: Implementation of
security and privacy to a comprehensive approach for avoiding knowledge thieving attack
exploitation decoy technology." International Journal of Scientific Research in Computer Science,
Engineering and Information Technology 2.4 (2017): 176-181.
8. Sandeep, Achi, and K. Gurnadha Guptha. "Compressive Study on Data Mining Methods in Cloud
Computing." International Journal of Scientific Research in Computer Science, Engineering and
Information Technology (IJSRCSEIT), ISSN (2017): 2456-3307.
9. Tao, D., Lin, Z., Wang, B.: Load feedback-based resource scheduling and dynamic migration-based
data locality for virtual Hadoop clusters in OpenStack-based clouds. Tsinghua Science and
Technology. IEEE (2017)
10. Anantharam, Banoth, and Gurnadha Guptha. "Security Issues in Various Cloud Computing:
Solution and Occur-Rent Solutions." International Journal of Scientific Research in Computer
Science, Engineering and Information Technology (IJSRCSEIT), ISSN (2016): 2456-3307.
11. Sharma, R., Kanungo, P.: An intelligent cloud computing architecture supporting E-Governance. In:
ICAC, pp. 1–5. IEEE (2011)
12. Charalabidis, Y., Koussouris, S., et al.: A cloud infrastructure for
collaborative digital public services. In: ICCTS, pp. 342–347. IEEE (2011)