Dissertation Final Report



Data Security in Cloud Computing

Mr. ANIL KUMAR MYSA

07053456

[email protected]

Supervisor: NICHOLAS IOANNIDES

[email protected]

A Dissertation submitted in partial fulfillment

of the requirements of London Metropolitan University

for the degree of Master of Science (MSc) in Computer Networking

Faculty of Computing

January 2010


Abstract

Cloud computing has become a significant technology trend, and many experts expect it to reshape information technology processes and the IT marketplace in the next few years. Security, and data security in particular, becomes more important when using cloud computing at all levels: Infrastructure as a Service (IaaS), Platform as a Service (PaaS) and Software as a Service (SaaS). A major reason for the lack of effective data security is simply the limitations of current encryption capabilities. This paper discusses novel techniques, such as the key derivation method and homomorphic tokens, which achieve stronger data security, storage correctness and fast localization of data errors. This is followed by a discussion of the performance evaluation and an outlook on the future.


Summary

Cloud computing is not an innovation in itself but a means of constructing IT services that use advanced computational power and improved storage capabilities, and it has drawn the attention of major industrial companies and scientific communities as well as user groups. Critics argue that cloud computing is not secure enough because data leaves companies' local area networks. Encryption is a well-known approach to addressing these types of security threats: for protection in the cloud, the enterprise would need to encrypt all data and communications.

This work approaches an effective, flexible distributed scheme with explicit dynamic data support to ensure the correctness of users' data in the cloud, utilizing a homomorphic token system with distributed verification of erasure-coded data and working on the integration of storage correctness insurance and data error localization.

It further seeks secure and efficient dynamic operations on data blocks, including data update, delete and append. The overall goal is to design an efficient mechanism for dynamic data verification and operation that achieves storage correctness, fast localization of data errors and dynamic data support, while minimizing the effect brought by data errors or server failures.

The basic coding tools, such as the homomorphic tokens needed for file distribution across cloud servers, are reviewed; the proposed approaches are analysed for the computational, storage and communication overhead of a data access operation; and, to prevent revoked users from gaining access to outsourced data through eavesdropping, efficient key management methods such as the key derivation hierarchy are presented.


Contents Page.no

Chapter 1 Introduction 11

1.1 Statement of Problem 11

1.2 Aims and Objectives 13

1.3 Approach and Methodology 15

1.4 Chapter Preview 16

Chapter 2.0 Literature review 17

2.1 Cloud Computing 18

Definitions

Models

Levels

Cloud Storage

2.2 Cloud Computing Security Issues 19

2.2a Confidentiality

2.2b Authentication

2.2c Authorization

2.2d Integrity

2.2e Availability

2.3 Technical Security Issues 19

2.4 Requirements of Data Security in Cloud Computing 22

2.5 Infrastructure Security

Network Level

Host Level


Application Level

Infrastructure Responsibilities and Challenges

Chapter 3.0 First Approach 23

Confidentiality 24

3.1 Encryption Techniques 26

3.2 Proof Of Retrievability 28

3.3 29

3.4 31

3.5 Conclusion 32

3.6 Summary 32

Chapter 4.0 Second Approach 33

Integrity

Message Authentication Code

Data Verifiability 33

4.1 Conclusion 38

4.2 Summary 38

Chapter 5.0 Third Approach 39

Availability 39

5.1 Major Threats 41

5.2 Network Based Attacks 42

5.3 CSP’s Own Availability 44

5.4 45


5.4.1 45

5.4.2 45

5.4.3 45

5.5 Conclusion 47

5.6 Summary 48

Chapter 6.0 Critical Appraisal, Recommendations and Future Work 49

6.1 Future Work 51

Chapter 7 Conclusions 52

Appendix A Scientific Article 53

Appendix B Project Proposal

References and Bibliography

List of Figures page.no

Fig 1. Cloud Data Storage Architecture 18

Fig 2. Key Derivation Hierarchy 25

Fig 3. Handling Updates to Data Blocks 30

Fig 4. HDFS Architecture 33

Fig 5. Cloud Computing Data Security Model 35


Definition of Terms

HDFS Architecture: The Hadoop Distributed File System (HDFS) is a

distributed file system designed to run on commodity hardware. It has

many similarities with existing distributed file systems. However, the

differences from other distributed file systems are significant. HDFS is

highly fault-tolerant and is designed to be deployed on low-cost

hardware. HDFS provides high throughput access to application data

and is suitable for applications that have large data sets. HDFS


relaxes a few POSIX requirements to enable streaming access to file

system data. HDFS was originally built as infrastructure for the

Apache Nutch web search engine project. HDFS is part of the Apache

Hadoop Core project.[1]

Reed–Solomon error correction is an error correcting code that works by

oversampling a polynomial constructed from the data. The polynomial is evaluated at

several points, and these values are sent or recorded. Sampling the polynomial more

often than is necessary makes the polynomial over-determined. As long as it receives

"many" of the points correctly, the receiver can recover the original polynomial even

in the presence of a "few" bad points.[2]
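The oversampling idea can be illustrated with a toy finite-field sketch. This is an assumption-laden illustration, not a production Reed–Solomon codec: it recovers from missing points (erasures) rather than correcting corrupted ones, and the field size 257 and evaluation points are arbitrary choices.

```python
# Toy illustration of oversampling a polynomial over GF(257).
# Data symbols are taken as evaluations at x = 0..k-1, so the first k
# encoded points are the data itself; any k correct points suffice to
# re-interpolate the polynomial and recover the data.
P = 257  # prime field modulus, an illustrative choice

def lagrange_at(points, x0):
    """Evaluate the unique polynomial through `points` at x0, mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x0 - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P  # den^-1 via Fermat
    return total

def encode(data, n):
    """Oversample: n > len(data) evaluations of the data polynomial."""
    base = list(enumerate(data))
    return [(x, lagrange_at(base, x)) for x in range(n)]

def decode(received, k, length):
    """Recover `length` data symbols from any k correct points."""
    pts = received[:k]
    return [lagrange_at(pts, i) for i in range(length)]
```

With three data symbols encoded into six points, any three surviving points reconstruct the data, which is exactly the over-determination the definition describes.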

Universal hashing is a randomized algorithm for selecting a hash function F with the

following property: for any two distinct inputs x and y, the probability that F(x)=F(y)

(i.e., that there is a hash collision between x and y) is the same as if F was a random

function. Thus, if F has function values in a range of size r, the probability of any

particular hash collision should be at most 1/r. There are universal hashing methods

that give a function F that can be evaluated in a handful of computer instructions.[3]
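One such family that indeed takes only a handful of instructions to evaluate is the classic linear construction h_{a,b}(x) = ((a·x + b) mod p) mod r; the following sketch uses an illustrative Mersenne prime and parameter names not taken from this text.

```python
import random

# Sketch of a universal family h_{a,b}(x) = ((a*x + b) mod p) mod r,
# with p a prime larger than any input; a and b are drawn at random per
# function. For fixed distinct x and y, the collision probability over
# the random draw of (a, b) is roughly 1/r.
P = 2**31 - 1  # a Mersenne prime, chosen for illustration

def make_hash(r, rng=random):
    a = rng.randrange(1, P)
    b = rng.randrange(0, P)
    return lambda x: ((a * x + b) % P) % r
```

Drawing many independent functions and counting how often two fixed inputs collide empirically approaches the 1/r bound.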

Homomorphic encryption is a form of encryption where one can perform a specific

algebraic operation on the plaintext by performing a (possibly different) algebraic

operation on the ciphertext. Depending on one's viewpoint, this can be seen as a

positive or negative attribute of the cryptosystem. Homomorphic encryption schemes

are malleable by design and are thus unsuited for secure data transmission. On the

other hand, the homomorphic property of various cryptosystems can be used to create


secure voting systems, collision-resistant hash functions and private information

retrieval schemes.[4]
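The multiplicative homomorphism of textbook RSA gives a compact illustration of this property; the key below is deliberately tiny and insecure, with standard textbook values chosen for illustration only (Python 3.8+ for the three-argument modular-inverse `pow`).

```python
# Textbook RSA is multiplicatively homomorphic:
# Enc(a) * Enc(b) mod n  decrypts to  a * b mod n.
p, q = 61, 53
n = p * q                 # 3233
phi = (p - 1) * (q - 1)   # 3120
e = 17                    # public exponent
d = pow(e, -1, phi)       # private exponent

def enc(m):
    return pow(m, e, n)

def dec(c):
    return pow(c, d, n)
```

Multiplying two ciphertexts yields a ciphertext of the product, which is precisely the malleability the text mentions; schemes such as Paillier provide the additive analogue.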

A Byzantine fault is an arbitrary fault that occurs during the execution of an algorithm

by a distributed system. It encompasses those faults that are commonly referred to as

"crash failures" and "send and omission failures". When a Byzantine failure has

occurred, the system may respond in any unpredictable way, unless it is designed to

have Byzantine fault tolerance.

These arbitrary failures may be loosely categorized as follows:

a failure to take another step in the algorithm, also known as a crash failure;

a failure to correctly execute a step of the algorithm; and

arbitrary execution of a step other than the one indicated by the algorithm.[5]

Glossary

ACM Access Control Matrix

MAC Message Authentication Code


HDFS Hadoop Distributed File System

SAAS Software as a Service

PAAS Platform as a Service

IAAS Infrastructure as a Service

SLA Service Level Agreement

SOA Service Oriented Architecture

ACKNOWLEDGMENT


To my supervisor, Dr. Nicholas Ioannides: thank you for your insight, patience, encouragement and guidance; for your constant willingness to answer the frequent questions I had during the course of my research; for your continuing support, which has helped me enthusiastically compile this work; and for your added humour, which left a smile on my face and enabled me to work with light-heartedness. I hope that all your future endeavours will be filled with affluence. I am indeed honoured and privileged to call myself your student.

To my parents, Mum and Dad: your prayers worked. Thanks a lot.

Chapter 1

1.1 Statement of Problem


Cloud computing is not an innovation in itself but a means of constructing IT services that use advanced computational power and improved storage capabilities, and it has drawn the attention of major industrial companies and scientific communities as well as user groups. From the providers' view, the main focus of cloud computing is extraneous hardware connected to support downtime on any device in the network.

Critics argue that cloud computing is not secure enough because data leaves companies' local area networks. It is up to the clients to choose between vendors, depending on how willing the vendors are to implement secure policies and be subject to third-party verification. [6]

Encryption is a well-known approach to addressing these types of security threats: for protection in the cloud, the enterprise would need to encrypt all data and communications. While it is not that difficult to add encryption software to the application environment initially, the new configuration requires ongoing management and maintenance. Moreover, in order to run the application in the cloud, the enterprise needs to deliver the encryption keys to the cloud to decrypt the data, creating additional security risks by exposing the keys in the operating environment. [7]

Recent advances in cryptography could mean that future cloud computing services will not only be able to encrypt documents to keep them safe in the cloud but also make it possible to search and retrieve this information without first decrypting it. Encrypted search architectures and tools have been developed by groups at several universities and companies. Though there are a variety of different approaches, most technologies encrypt the data files, as well as tags called metadata that describe the


contents of those files, and issue a master key to the user. The token used to search through encrypted data contains functions that can find matches to the metadata attached to certain files and then return the encrypted files to the user. Once the user has a file, he can use his master decryption key to decrypt it. [8]

Firstly, traditional cryptographic primitives for the purpose of data security protection cannot be directly adopted, due to the users' loss of control of their data under cloud computing. Therefore, verification of correct data storage in the cloud must be conducted without explicit knowledge of the whole data. Considering the various kinds of data each user stores in the cloud and the demand for long-term continuous assurance of their data safety, the problem of verifying the correctness of data storage in the cloud becomes even more challenging. Secondly, cloud computing is not just a third-party data warehouse. The data stored in the cloud may be frequently updated by the users, through insertion, deletion, modification, appending, reordering, etc. Ensuring storage correctness under dynamic data updates is hence of paramount importance. However, this dynamic feature also makes traditional integrity insurance techniques futile and entails new solutions. Last but not least, the deployment of cloud computing is powered by data centres running in a simultaneous, cooperative and distributed manner. Individual users' data is redundantly stored in multiple physical locations to further reduce the data integrity threats. Therefore, distributed protocols for storage correctness assurance will be of the greatest importance in achieving a robust and secure cloud data storage system in the real world. However, more research effort is needed to achieve flexible access control to large-scale dynamic data. [9]


1.2 Aims and Objectives:

The aim of the project is to study and evaluate the major security concerns of confidentiality, integrity and availability in cloud computing, in order to achieve secure data in the Infrastructure as a Service (IaaS) model.

Objectives:

Academic objectives: This project will introduce an effective and flexible distributed scheme with two salient features, utilizing a homomorphic token with distributed verification of erasure-coded data to achieve the integration of storage correctness insurance and data error localization, and designing a data security model.

To determine an efficient key management method for data block encryption.

To prevent revoked users from gaining access to outsourced data through eavesdropping.

To analyse the computational, storage and communication overhead of a data access operation.

To review the basic tools from coding theory that are needed for file distribution across cloud servers.

Personal Objectives:


To gain sound knowledge of cryptographic tools and encryption key methods for dynamic data storage.

To understand the issues and problems associated with data storage in cloud computing.

To get information on cloud security issues from journals, conference papers and IT magazines.

To know the security issues, technical and non-technical, in cloud computing.

1.3 Approach and Methodology:

To develop a new scheme that integrates several advanced techniques for secure and efficient access to large-scale outsourced data in cloud computing: encrypting every data block with a different symmetric key and adopting a key derivation method; providing fine-grained access control over the outsourced data with flexible and efficient management, reducing the risk to the owner, since a user needs only a few secrets for key derivation and does not need to access the storage server except for data updates; and constructing the key hierarchies, key derivation procedures and mechanisms to handle dynamics in the outsourced data blocks.

This work approaches an effective, flexible distributed scheme with explicit dynamic data support to ensure the correctness of users' data in the cloud, utilizing a


homomorphic token system with distributed verification of erasure-coded data, and working on the integration of storage correctness insurance and data error localization. It further seeks secure and efficient dynamic operations on data blocks, including data update, delete and append. Overall, the goal is to design an efficient mechanism for dynamic data verification and operation that achieves storage correctness, fast localization of data errors and dynamic data support, while minimizing the effect brought by data errors or server failures.

The basic tools from coding theory needed for file distribution are reviewed, together with the homomorphic token system, which belongs to the universal hash function family, preserves homomorphic properties and integrates with the verification of erasure-coded data; a challenge-response protocol is also derived to verify storage correctness as well as to identify misbehaving servers. [9]
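The challenge-response idea behind such tokens can be sketched as a random linear combination over a prime field. This is a toy illustration of the general principle, not the actual construction in [9]; the block values, coefficients and field are invented.

```python
import random

# Toy linear homomorphic token: the owner precomputes a random linear
# combination of the data blocks over a prime field, then later
# challenges the server to recompute it from the stored blocks. A
# single-block corruption shifts the token by a nonzero multiple of the
# (nonzero) coefficient and is therefore detected.
P = 2**31 - 1  # prime field modulus, an illustrative choice

def make_token(blocks, coeffs):
    """Token = sum(alpha_i * b_i) mod P over the challenge coefficients."""
    return sum(a * b for a, b in zip(coeffs, blocks)) % P

rng = random.Random(1)
owner_blocks = [17, 42, 99, 7]                        # stand-in "data blocks"
coeffs = [rng.randrange(1, P) for _ in owner_blocks]  # secret nonzero challenge
expected = make_token(owner_blocks, coeffs)           # precomputed before outsourcing
```

The server later answers a challenge with `make_token(stored_blocks, coeffs)`; a mismatch with the precomputed value localizes misbehaviour to that server, and because the combination is linear it composes with linear erasure codes.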

1.4 Chapter Preview: This section outlines the report. Chapter 1 introduces the project, covering the statement of the problem, contribution and report outline. Chapter 2 explores the background and literature review. Chapter 3, the first approach, explains how to secure efficient access to outsourced data. Chapter 4 gives the security model for cloud computing. Chapter 5 shows how to ensure storage security in cloud computing. Chapter 6 gives the critical appraisal and future work. Chapter 7 presents the conclusions.


Chapter 2 Literature Review

2.1 Cloud Computing

The concept of cloud computing addresses the next evolutionary step of distributed

computing. The goal of this computing model is to make a better use of distributed

resources, put them together in order to achieve higher throughput and be able to

tackle large-scale computation problems. Cloud computing is not a completely new concept for the development and operation of web applications. It allows for the most cost-effective development of scalable web portals on highly available and fail-safe infrastructures. The evolution of cloud computing means it can handle massive data as an on-demand service [10].

There are three categories of cloud computing:

Software as a service (SaaS): software offered by a third-party provider, available on demand, usually via the Internet, and configurable remotely. Examples include online


word processing and spreadsheet tools, CRM services and web content delivery

services (Salesforce CRM, Google Docs, etc.).

Platform as a service (PaaS): allows customers to develop new applications using

APIs deployed and configurable remotely. The platforms offered include development

tools, configuration management, and deployment platforms. Examples are Microsoft Azure, Force.com and Google App Engine.

Infrastructure as a service (IaaS): provides virtual machines and other abstracted

hardware and operating systems which may be controlled through a service API.

Examples include Amazon EC2 and S3, Terremark Enterprise Cloud, Windows Live

Skydrive and Rackspace Cloud.

Clouds may also be divided into

Public: available publicly - any organisation may subscribe

Private: services built according to cloud computing principles, but accessible only

within a private network

Partner: cloud services offered by a provider to a limited and well-defined number of parties. [9][11]


Fig 1. Cloud Data Storage Architecture

2.2 Cloud Computing Security Issues

Privileged user access: information transmitted from the client through the Internet poses a certain degree of risk because of issues of data ownership; enterprises should spend time getting to know their providers and their regulations as much as possible, assigning some trivial applications first to test the water. [6]

Regulatory compliance: clients are accountable for the security of their solution, as they can choose between providers that allow auditing by third-party organizations that check levels of security, and providers that do not.

Service Level Agreement (SLA): the vendor has to provide assurances to convince the customer on security issues in cloud computing. Performance management should be reviewed regularly by the two parties to minimize and resolve unplanned incidents, covering customer duties and responsibilities, service qualities, third-party claims for breaches, exclusions, and adequate provisions for disaster recovery and business continuity planning to protect the service. [12]

2.3 Technical Security Issues

The current browser-based authentication protocols for the cloud are not secure, as the browser is unable to issue XML-based tokens by itself, and federated identity management systems store security tokens within the browser, where they are protected only by integrating TLS and the Same Origin Policy (SOP) in a better fashion.

A promising countermeasure approach is to perform a service instance integrity check prior to using a service instance for incoming requests in the cloud system, which prevents the cloud malware injection attack; for the metadata spoofing attack, a hash-based integrity verification of the metadata description files prior to usage is required. By strengthening the security capabilities and integrating secure web service frameworks into the web browser, cloud computing security can be improved. [13]

Data location: depending on contracts, some clients might never know in what country or jurisdiction their data is located.

Data segregation: encrypted information from multiple companies may be stored on the same hard disk, so a mechanism to separate data should be deployed by the provider.

Recovery: every provider should have a disaster recovery protocol for user data.


Investigative support: if a client suspects faulty activity by the provider, it may not have many legal ways to pursue an investigation.

Long-term viability: refers to the ability to retract a contract and all data if the current provider is bought out by another firm. [6]

Security, and data security in particular, becomes more important when using cloud computing at all levels: Infrastructure as a Service (IaaS), Platform as a Service (PaaS) and Software as a Service (SaaS). A major reason for the lack of effective data security is simply the limitations of current encryption capabilities: adequately detailed data lineage (mapping) is simply not possible in today's cloud computing offerings. Another major concern is data residue left behind and possibly becoming available to unauthorized parties.

These concerns with data security do not negate the advantages of utilizing storage as a service in the cloud for non-sensitive, non-regulated data. If customers do want to store organizational data in the cloud, they must take explicit action, or at least verify, that the provider can and will adequately protect their data stored in the cloud.

We know how to effectively encrypt data in transit and we know how to effectively encrypt data at rest, but encrypted data cannot be processed or searched; doing any of those important activities requires that the data be decrypted, hence a security concern, especially if that data is in the cloud and beyond the data owner's direct control.

Even efforts to effectively manage data that is encrypted are extremely complex and troublesome, due to the current inadequate capabilities of the key management


products. Key management in an intra-organizational context is difficult enough; trying to do key management effectively in the cloud is frankly beyond current capabilities, and it will require significant advances in both encryption and key management capabilities to be viable. Claims that key management products are currently effective are naïve at best.

Due to the nature of cloud computing, such as multi-tenancy, and the volume of data likely to be put in the cloud, data security capabilities are important for the future of cloud computing. Because of that, coupled with today's inadequate encryption and key management capabilities, cryptographic research efforts such as predicate encryption are underway to limit the amount of data that needs to be decrypted for processing in the cloud. The recently announced capability of fully homomorphic encryption to process encrypted data should be a huge benefit to cloud computing. Similarly, research into large-scale multi-tenancy key management should also be encouraged, as it would be of enormous benefit to cloud computing. [13]

2.4 Requirements of Data Security in Cloud Computing

Secure data storage management is an important aspect of quality of service; cloud computing inevitably poses new challenging security threats for a number of reasons. Traditional cryptographic primitives for the purpose of data security protection cannot be directly adopted, due to the users' loss of control of their data under cloud computing. According to the Cloud Security Alliance Guidance, a secure outsourcing service should be evaluated on at least (1) strong encryption and scalable key management, (2) user provisioning and deprovisioning, and (3) system availability and performance. Securing outsourced data for multi-user access can be achieved only if both the data and the metadata are properly protected.


Therefore, verification of correct data storage in the cloud must be conducted without explicit knowledge of the whole data. Considering the various kinds of data each user stores in the cloud and the demand for long-term continuous assurance of their data safety, the problem of verifying the correctness of data storage in the cloud becomes even more challenging. Secondly, cloud computing is not just a third-party data warehouse. The data stored in the cloud may be frequently updated by the users, through insertion, deletion, modification, appending, reordering, etc. Ensuring storage correctness under such dynamic updates is essential, but this dynamic feature also makes traditional integrity insurance techniques futile and entails new solutions. Last but not least, the deployment of cloud computing is powered by data centres running in a simultaneous, cooperative and distributed manner. Individual users' data is redundantly stored in multiple physical locations to further reduce the data integrity threats. Therefore, distributed protocols for storage correctness assurance will be of the greatest importance in achieving robust and secure cloud data storage.

Many earlier approaches adopted asymmetric encryption; applying it at the data block level would make the key management mechanism of secure file systems very cumbersome. These techniques, while useful to ensure storage correctness without having users possess the data, cannot address all the security threats in cloud data storage, since they all focus on the single-server scenario and do not consider dynamic operations. Distributed protocols have been proposed for ensuring storage correctness across multiple servers or peers. Again, none of these distributed schemes is aware of dynamic data operations, and as a result their applicability to cloud data storage can be drastically limited [9][4][15].


Chapter 3

3.0 Title: Secure and Efficient Access to Outsourced Data

To enable secure and efficient access to outsourced data, investigators have tried to integrate key derivation mechanisms [16, 17, 18, 19] with encryption-based data access control. [20] proposes a generic method that uses only hash functions to derive a descendant's key in a hierarchy. The method can handle updates locally and avoid propagation. Although the proposed key derivation tree structure can be viewed as a special case of access hierarchies, analysis shows that the proposed method serves the studied application better. In [21], the authors divide users into groups based on their access rights to the data. The users are then organized into a hierarchy and further transformed into a tree structure to reduce the number of encryption keys. This method also helps to reduce the number of keys that are given to each user during the initiation procedure. In [22], data records are organized into groups based


on the users that can access them. Since the data in the same group are encrypted by the same key, changes to user access rights will lead to updates in the data organization. While a creative idea in this approach is to allow servers to conduct a second-level encryption (over-encryption) to control access, repeated access revocation and granting may lead to a very complicated hierarchy structure for key management. In [23], the approach stores multiple copies of the same data record encrypted by different keys; when access rights change, re-encryption and data updates on the server must be conducted. These operations cause extra overhead on the server and do not fit the proposed approach. An experimental evaluation of these approaches can be found in [24].

The basic idea is to generate the data block encryption keys through a hierarchy. Every key in the hierarchy can be derived by combining its parent node and some public information. As the derivation procedure uses a one-way function, the secret keys of the parent node and sibling nodes cannot be calculated. In this way, the data owner needs to maintain only the root node of the hierarchy, and during the key distribution procedure the owner can send secrets in the hierarchy to end users based on their access rights. The end users then derive the leaf nodes of the hierarchy to decrypt the data blocks.


Fig 2. Key Derivation Hierarchy

Assuming that the outsourced data contain n blocks with 2^(p-1) <= n < 2^p, a binary tree structure with height p can be built. The data owner chooses a root secret K_{0,1}, where the first index of a key represents its level in the hierarchy and the second index represents its sequence within that level. The data owner also chooses a public hash function h(). For any node K_{i,j} in the hierarchy, its left child can be calculated as K_{i+1,2j-1} = h(K_{i,j} || (2j-1) || K_{i,j}), i.e. by sandwiching the sequence number of the child node between copies of the parent's key and then applying the hash function; the right child K_{i+1,2j} is calculated in the same way with sequence number 2j. A node can calculate the secrets of all its descendants by applying this function repeatedly, and on reaching level p of the hierarchy the hash results are used as keys to encrypt the data blocks.
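The derivation described above can be sketched as follows, assuming SHA-256 as the public hash h() and the "sandwich" concatenation order the text describes; the exact byte encoding of the sequence number is an assumption made for illustration.

```python
import hashlib

# Binary key-derivation hierarchy sketch:
#   K_{i+1, 2j-1} = h(K_{i,j} || (2j-1) || K_{i,j})   (left child)
#   K_{i+1, 2j}   = h(K_{i,j} || (2j)   || K_{i,j})   (right child)
# with K_{0,1} as the root secret.

def child_key(parent_key: bytes, child_seq: int) -> bytes:
    return hashlib.sha256(
        parent_key + str(child_seq).encode() + parent_key
    ).digest()

def derive(root: bytes, level: int, seq: int) -> bytes:
    """Derive K_{level,seq} from the root K_{0,1} by walking down the tree."""
    path = []
    l, s = level, seq
    while l > 0:              # collect sequence numbers from leaf up to root
        path.append(s)
        s = (s + 1) // 2      # parent of child s is ceil(s/2)
        l -= 1
    key = root
    for s in reversed(path):  # then hash downward from root to leaf
        key = child_key(key, s)
    return key
```

A user who is given only an interior key, say K_{1,1}, can derive the keys of that subtree but, because h is one-way, neither the root nor the sibling subtrees.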

3.1 Data Access Procedure

To prevent revoked users from gaining access to outsourced data through eavesdropping, the service provider will conduct over-encryption when it sends data


blocks to end users; the service provider and end users share a pseudo-random bit sequence generator P().

Representing:

O data owner,

S service provider,

U end user.

Since only U and O know Kou, O will be able to authenticate the sender. The request index is increased by 1 every time U sends out a request, and it is used by O to defend against replay attacks. The request contains the index numbers of the data blocks that U wants to access. The Message Authentication Code (MAC) protects the integrity of the packet.
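A minimal sketch of such a request packet follows, assuming HMAC-SHA-256 as the MAC; the field names and JSON encoding are invented for illustration, since the text does not fix a packet format.

```python
import hashlib
import hmac
import json

# U and O share K_ou; the packet carries a monotonically increasing
# request index (replay defence) and an HMAC over the body (integrity).

def make_request(k_ou: bytes, user: str, req_index: int, blocks: list) -> dict:
    body = {"user": user, "req_index": req_index, "blocks": blocks}
    msg = json.dumps(body, sort_keys=True).encode()
    return {**body, "mac": hmac.new(k_ou, msg, hashlib.sha256).hexdigest()}

def verify_request(k_ou: bytes, packet: dict, last_index: int) -> bool:
    body = {k: v for k, v in packet.items() if k != "mac"}
    msg = json.dumps(body, sort_keys=True).encode()
    expected = hmac.new(k_ou, msg, hashlib.sha256).hexdigest()
    # constant-time tag comparison, plus the monotone-index replay check
    return hmac.compare_digest(expected, packet["mac"]) and packet["req_index"] > last_index
```

A replayed packet fails the index check even though its MAC is valid, and a tampered block list fails the MAC check.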

When O receives this message, it will authenticate the sender and verify the integrity of the message. It will then examine its access control matrix and make sure that U is authorised to read all blocks in the request. If the request passes this check, the owner will determine the smallest set of keys K' in the key hierarchy such that (1) K' can derive the keys that are used to encrypt the requested data blocks, and (2) U is authorized to know all keys that can be derived from K'; this set can be determined by an algorithm. The owner will then generate the reply to the end user.


The ACM index is used by O to label the freshness of the Access Control Matrix (ACM). This index is increased by 1 every time O changes some end user's access rights. The updated ACM index is sent to S by O to prevent revoked users from using old certificates to access data blocks. The seed is a random number used to initialise P() so that U can decrypt the over-encryption conducted by S. U will use K' to derive the data block encryption keys. The format of the certificate for the service provider is as follows.

The user U will send this packet, including the cert, to the service provider. When S receives this packet, it can verify that the cert was generated by O, since only they know the secret key Kos. S will make sure that the user name and request index in the cert match the values in the packet. If the ACM index in the cert is smaller than the value that S received from O, some changes to the access control matrix have happened, and S will notify U to get a new cert. Otherwise, the service provider will retrieve the encrypted data blocks and conduct the over-encryption as follows: using the seed as the initial state of P(), the function generates a long sequence of pseudo-random bits. S uses this bit sequence as a one-time pad and conducts the XOR operation to encrypt the blocks. The computation results are then sent to U.

When U receives the data blocks it will use seed to generate to pseudo random bit

sequence and use K` to derive the encryption keys then the data blocks are recovered
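P( ) is not specified beyond being a pseudo-random generator; the following is a minimal sketch of the over-encryption step, assuming a hash-based keystream generator as a stand-in for P( ):

```python
import hashlib

def prg_keystream(seed: bytes, n: int) -> bytes:
    """Expand `seed` into n pseudo-random bytes (a stand-in for P())."""
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]

def over_encrypt(block: bytes, seed: bytes) -> bytes:
    """XOR the (already encrypted) block with the keystream, as S does.
    Applying the same operation again removes the layer, as U does."""
    pad = prg_keystream(seed, len(block))
    return bytes(b ^ p for b, p in zip(block, pad))

seed = b"fresh seed from the owner"
block = b"ciphertext of a data block"
assert over_encrypt(over_encrypt(block, seed), seed) == block
```

Because XOR with the same keystream is an involution, U removes the over-encryption with exactly the code S used to add it, provided both derive the keystream from the same seed.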


When an end user U loses access to some data blocks, the access control matrix at O will be updated. The update will be sent to S through a secure channel. If U presents the old cert to S, it will be rejected, since its ACM index value is invalid. However, U could still get access to the data blocks by eavesdropping on the traffic between S and other end users if it has kept a copy of the key set K'. To defend against such attacks, the service provider conducts over-encryption before sending out the data blocks. Since for every data request the seed is dynamically generated by O and never transmitted in plaintext, U will not be able to regenerate the bit sequences of other end users. Therefore, unless U has kept a copy of the data blocks from a previous access, it will not be able to get the information.

3.2 Dynamics in User Access Rights

In lazy revocation it is assumed that it is acceptable for revoked users to read unmodified data blocks; however, they must not be able to read updated blocks. Lazy revocation trades re-encryption and data access overhead for a degree of security.

When the access right of user U to data block Di is revoked, the access control matrix at O will be updated and the ACM index increased. At the same time, O will label this data block to show that some user's access right has been revoked since its last content update. Until Di is next updated, the owner will not change the block on the outsourced storage. Since the ACM index value has changed, U can no longer use its old cert to access Di. However, when another user gets the encrypted Di through the network, U can eavesdrop on the traffic. Since the service provider does not conduct over-encryption in this case, the data will be transmitted in the same format whoever the reader is. Therefore, if U has kept a copy of the encryption key, it will still get access to Di. This, however, is equivalent to U having kept a copy of Di before its access right was revoked.

When the owner needs to change the data block from Di to D'i, it will check the label and find that some users' access rights have been revoked. Therefore it cannot encrypt the updated data block with the current key. The solution for this drawback is that the owner will encrypt, with the secret Kp,i, a control block indicating where the updated data is stored. When a user receives this control block from the service provider, it will submit it to the owner. The owner will derive the new key and send it back to the user. At the same time a new cert will be generated so that the user can get the new block from the service provider. A revoked user will still be able to get access to the control block; however, the owner will not send the new encryption key and cert to it. Therefore the revoked user cannot get access to the updated data.

3.3 Dynamics in Outsourced Data

When a data block Di is deleted from the outsourced data, the owner will use a special control block to replace Di. The special block will be encrypted by Kp,i and stored at the original slot for Di on the service provider. At the same time, the owner will label its access control matrix to show that the block no longer exists. The end users can still access this control block, but they will not get any useful information from its contents.


The data block update is conducted as follows. When the owner needs to update Di, it will use Kp,i to encrypt the control block and store it in the i-th block of the outsourced data. The control block will contain (1) (2p+i), which is the index of the block in which D'i is stored; (2) x, which is the number of times that Di has been updated; and (3) an integrity check value computed with the secret Kverify, which is used to protect the integrity of the control block. The owner will encrypt D'i and store the result in the block with index number (2p+i).
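As a rough illustration of the control-block idea (the field names and the MAC construction are assumptions, and the encryption layer under Kp,i is omitted from this sketch):

```python
import hashlib
import hmac
import json

def make_control_block(k_verify: bytes, p: int, i: int, x: int) -> bytes:
    """Control block left in slot i after Di is updated: it records the
    slot of the new version (2p+i) and the update counter x, protected
    by a MAC under Kverify."""
    body = json.dumps({"slot": 2 * p + i, "updates": x}).encode()
    tag = hmac.new(k_verify, body, hashlib.sha256).hexdigest().encode()
    return body + b"|" + tag

def open_control_block(k_verify: bytes, block: bytes) -> dict:
    """Owner-side check: only the holder of Kverify can validate the MAC,
    so attackers cannot generate fake control blocks."""
    body, _, tag = block.rpartition(b"|")
    expect = hmac.new(k_verify, body, hashlib.sha256).hexdigest().encode()
    if not hmac.compare_digest(expect, tag):
        raise ValueError("forged control block")
    return json.loads(body)

k = b"Kverify (known only to the owner)"
cb = make_control_block(k, p=100, i=7, x=3)
assert open_control_block(k, cb) == {"slot": 207, "updates": 3}
```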

Handling updates to data blocks

When user U needs to access the updated data block D'i, it will first get the encrypted control block from S and submit it to the data owner. The owner will use the secret Kverify to examine the integrity of the control block. It will then use its root secret and x to derive the encryption key of D'i. The owner will return the encryption key and a new cert to U through the secure communication channel between them. U will then get D'i from the service provider. This method has several properties: all metadata is stored in the control block on the service provider, so the data owner only needs to store two secrets, the root key and Kverify; and since Kverify is known only to the owner, attackers cannot generate fake control blocks.


The data blocks that are always accessed together should be given sequential block index numbers, so that the owner can derive a smaller access key set K' for users. The owner can reserve some empty slots in the outsourced data and later insert new data into these positions based on their access patterns.

3.4 Analysis of Overhead:

The proposed approach introduces very limited storage overhead. The key derivation mechanism allows the owner O to store only the root keys of the hierarchies. The end user U does not need to pre-calculate and store all data block encryption keys; on the contrary, it can calculate the keys on the fly while conducting the data block decryption operations. The service provider S needs to store an extra copy of the updated data blocks; when the data update rate is very low in the application environment, this extra storage overhead at S is also low compared to the size of the outsourced data [15].
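The hierarchy details are not reproduced in this report; the following is a minimal sketch of hash-based key derivation in which the owner stores only the root secret and all block keys are computed on the fly (the index encoding is an illustrative choice):

```python
import hashlib

def derive_key(parent: bytes, index: int) -> bytes:
    """Child key = H(parent || child index). The hash is one-way, so a
    user holding a subtree key cannot climb back to the parent key."""
    return hashlib.sha256(parent + index.to_bytes(4, "big")).digest()

def block_key(root: bytes, path) -> bytes:
    """Walk from the owner's root secret down to a leaf (block) key."""
    k = root
    for i in path:
        k = derive_key(k, i)
    return k

root = hashlib.sha256(b"owner root secret").digest()
# Handing a user the key for subtree 0 lets it derive the keys for all
# blocks under that subtree only, without storing each leaf key.
subtree0 = derive_key(root, 0)
assert block_key(root, (0, 1)) == derive_key(subtree0, 1)
```

This is why the owner needs only "a few secrets": a single root (or a handful of subtree roots) suffices to regenerate every per-block key on demand.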

3.5 Conclusion:

In this paper the authors proposed a mechanism to achieve secure and efficient access to outsourced data in owner-write-users-read applications. They assumed that the outsourced data has a very large scale and tried to reduce the overhead at the data owner and service provider. It is proposed to encrypt every data block with a different key so that flexible cryptography-based access control can be achieved. Through the adoption of the key derivation method, the owner needs to maintain only a few secrets. Analysis shows that the key derivation procedure based on hash functions introduces very limited overhead. Over-encryption and/or lazy revocation is used to prevent revoked users from getting access to updated data blocks. The authors worked on mechanisms to handle both updates to outsourced data and changes in user access rights, analyzed the computational, storage, and communication overhead of the approach, and also investigated its scalability and safety [15].

3.6 Summary

This chapter integrated several advanced techniques for secure and efficient access to large-scale outsourced data in cloud computing: every data block is encrypted with a different symmetric key, and a key derivation method is adopted to provide fine-grained access control over the outsourced data with flexible and efficient management. The risk and burden at the owner are reduced, since it needs only a few secrets for key derivation and does not need to access the storage server except for data updates. By constructing the key hierarchy, the key derivation procedures, and mechanisms to handle dynamics in the outsourced data blocks, the overhead of the proposed approach was investigated for data retrieval from scientific databases [15].

Chapter 4.0 Data Security Model for Cloud Computing:

In the cloud computing environment, the traditional access control mechanism has serious shortcomings. The new architecture is built with Hadoop and HBase technologies, which enhance the performance of cloud systems but bring in risks at the same time.


From the analysis of HDFS, the data security needs of cloud computing begin with client authentication at login. The vast majority of cloud computing is accessed through a browser client, so verifying the user's identity is the primary requirement of cloud computing applications. If the namenode is attacked or fails, there will be disastrous consequences for the system, so the effectiveness and efficiency of the namenode in cloud computing is key to the success of data protection; enhancing the namenode's security is therefore very important.

HDFS Architecture

As the datanode is the data storage node, there is the possibility of failure, and the availability of data cannot be guaranteed. Currently each data storage block in HDFS has at least three replicas, which is the HDFS backup strategy. When it comes to how to ensure safe reading and writing of data, HDFS has not given any detailed explanation, so the need to ensure rapid recovery and to make read and write operations fully controllable cannot be ignored. In addition, access control, file encryption, and similar demands of the cloud computing model for data security must be taken into account.


All data security techniques are built on three basic principles: confidentiality, integrity and availability. Confidentiality refers to hiding the actual data or information; in cloud computing the data is stored in data centres, so security and confidentiality are all the more important. Integrity requires a guarantee that data in any state is not subject to unauthorized deletion, modification or damage.

The data model of cloud computing can be described mathematically as follows.


Cloud Computing Data Security Model

The model uses a three-level defence structure in which each layer performs its own duty to ensure the data security of the cloud.

The first layer is responsible for user authentication: issuing the appropriate digital certificates to users and managing user permissions.

The second layer is responsible for encrypting users' data and protecting the privacy of users through certain means.

The third layer is responsible for fast recovery of user data, and is the last layer of protection for user data. Within this three-level structure, user authentication is used to ensure that data is not tampered with. An authenticated user can manage the data through operations such as add, modify and delete. If the user authentication system is deceived by illegal means and a malign user enters the system, file encryption and privacy protection provide the next level of defence: in this layer user data is encrypted, so even if the key is illegally accessed, through privacy protection a malign user will still be unable to obtain effective access to the information, which is very important for protecting business users' trade secrets in cloud computing. Finally, the rapid file restoration layer, through a fast recovery algorithm, enables user data to achieve maximum recovery even in case of damage.

Hence the cloud computing model for data security is designed [25].


4.1 Conclusion

Data security becomes more important in cloud computing; through an analysis of the HDFS architecture, the data security requirements of cloud computing are set out and a mathematical model is designed [25].

4.2 Summary:

The cloud computing environment is a dynamic environment in which the user's data transmits from the data centre to the user's client, and the user's data changes all the time. HDFS, used in large-scale cloud computing, is a typical distributed file system architecture.

All data security techniques are built on confidentiality, integrity and availability; taking these into consideration, a mathematical data model is designed [25].


Chapter 5.0 Ensuring Data Storage Security in cloud computing

In [27] a formal "proof of retrievability" (POR) model for ensuring remote data integrity is described. Their scheme combines spot-checking and error-correcting codes to ensure both possession and retrievability of files on archive service systems. [28] built on this model and constructed a random linear function based homomorphic authenticator which enables an unlimited number of queries and requires less communication overhead. [29] proposed an improved framework for POR protocols that generalizes the work of both [27] and [28]. Later, in their subsequent work, [29] extended the POR model to distributed systems. However, all these schemes focus on static data. The effectiveness of their schemes rests primarily on the pre-processing steps that the user conducts before outsourcing the data file F. Any change to the contents of F, even of a few bits, must propagate through the error-correcting code, thus introducing significant computation and communication complexity. [30] define the "provable data possession" (PDP) model for ensuring possession of files on untrusted storage. Their scheme utilizes public-key-based homomorphic tags for auditing the data file, thus providing public verifiability. However, their scheme requires considerable computation overhead, which can be expensive for an entire file. In their subsequent work, [31] described a PDP scheme that uses only symmetric key cryptography. This method has lower overhead than their previous scheme and allows for block updates, deletions and appends to the stored file, which is also supported in this work. However, their scheme focuses on the single-server scenario and does not address small data corruptions, leaving both the distributed scenario and the data error recovery issue unexplored. [32] aimed to ensure data possession of multiple replicas across a distributed storage system. They extended the PDP scheme to cover multiple replicas without encoding each replica separately, providing guarantees that multiple copies of the data are actually maintained. In other related work, [33] presented a P2P backup scheme in which blocks of a data file are dispersed across m+k peers using an (m+k, m)-erasure code. Peers can request random blocks from their backup peers and verify the integrity using separate keyed cryptographic hashes attached to each block. Their scheme can detect data loss from free-riding peers, but does not ensure that all data is unchanged. [34] proposed verifying data integrity using an RSA-based hash to demonstrate uncheatable data possession in peer-to-peer file sharing networks. However, their proposal requires exponentiation over the entire data file, which is clearly impractical for the server whenever the file is large. [35] proposed allowing a TPA to keep online storage honest by first encrypting the data and then sending a number of pre-computed symmetric-keyed hashes over the encrypted data to the auditor. However, their scheme only works for encrypted files, and auditors must maintain long-term state. [36] proposed ensuring file integrity across multiple distributed servers, using erasure coding and block-level file integrity checks. However, their scheme only considers static data files and does not explicitly study the problem of data error localization, which this work considers.


In this approach an effective and flexible distributed scheme with explicit dynamic data support to ensure the correctness of users' data in the cloud is proposed. An erasure-correcting code in the file distribution preparation provides redundancy and guarantees data dependability, and this construction drastically reduces the communication overhead. To achieve storage correctness assurance as well as data error localization, a homomorphic token with distributed verification of erasure-coded data is utilized.

The main idea is as follows: before file distribution, the user pre-computes a certain number of short verification tokens on each individual vector G(j) (j ∈ {1, ..., n}), each token covering a random subset of data blocks. To ensure storage correctness, the user challenges the provider with a set of randomly generated block indices; the server computes a short signature over the specified blocks and returns it to the user, who compares it with the pre-computed tokens. The requested response values of the integrity check must also be a valid codeword determined by the secret matrix P.

5.1 Challenge Token Preparation

When the user wants to challenge the cloud servers t times to ensure the correctness of data storage, the user must pre-compute t verification tokens for each G(j) (j ∈ {1, ..., n}) using a PRF f(.), a PRP φ(.), a challenge key kchal and a master permutation key KPRP. To generate the i-th token for server j, the user acts as follows: derive a random challenge value αi of GF(2^p) by αi = f_kchal(i) and a permutation key k_i^PRP based on KPRP; compute the set of r randomly chosen indices; and calculate the token v_i^(j). This token, an element of GF(2^p) with small size, is the response the user expects to receive from server j when the user challenges it on the specified data blocks. Once all tokens are computed, the final step before file distribution is to blind each parity block G^(j)(i), where kj is the secret key for parity vector G(j) (j ∈ {1, ..., n}) across the cloud servers S1, S2, ..., Sn.
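The token formula itself is not reproduced in this report; the following is a toy sketch of the pre-computation, using HMAC as the PRF, a seeded shuffle as the PRP, and integers modulo a prime as a stand-in for GF(2^p) arithmetic:

```python
import hashlib
import hmac
import random

P = 2**31 - 1  # toy prime field standing in for GF(2^p)

def prf(key: bytes, i: int) -> int:
    """f_kchal(i): derive the non-zero challenge value alpha_i."""
    d = hmac.new(key, i.to_bytes(4, "big"), hashlib.sha256).digest()
    return int.from_bytes(d, "big") % (P - 1) + 1

def prp_indices(key: bytes, i: int, l: int, r: int) -> list:
    """phi: r pseudo-randomly chosen block indices out of l, per round i."""
    d = hmac.new(key, i.to_bytes(4, "big"), hashlib.sha256).digest()
    return random.Random(d).sample(range(l), r)

def token(vector: list, k_chal: bytes, k_prp: bytes, i: int, r: int) -> int:
    """A short token over r challenged blocks: the response the user
    will later expect from the server holding this vector."""
    alpha = prf(k_chal, i)
    idx = prp_indices(k_prp, i, len(vector), r)
    return sum(pow(alpha, q + 1, P) * vector[j] for q, j in enumerate(idx)) % P
```

The user stores only the short tokens (one field element each), not the challenged blocks themselves.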

5.2 Correctness Verification and Error Localization

The user reveals αi as well as the i-th permutation key k_i^PRP to each server. The server storing vector G(j) aggregates the r rows specified by k_i^PRP into a linear combination R_i^(j).

Upon receiving R_i^(j) from all the servers, the user takes away the blind values. As all the servers operate over the same subset of indices, the linear aggregation of these r specified rows has to be a codeword in the encoded file matrix. Once an inconsistency among the storage has been detected, the user relies on the pre-computed verification tokens to further determine where the potential data errors lie. Each response R_i^(j) is computed in exactly the same way as the token v_i^(j); thus the user can simply find which server is misbehaving by verifying the n equations. The algorithm gives the details of correctness verification and error localization.
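A toy sketch of the localization step (integers modulo a prime stand in for GF(2^p); the server-side aggregation mirrors how the tokens are pre-computed, so comparison is elementwise):

```python
P = 2**31 - 1  # toy prime field standing in for GF(2^p)

def aggregate(vec: list, alpha: int, idx: list) -> int:
    """The server's linear combination over the r challenged rows."""
    return sum(pow(alpha, q + 1, P) * vec[j] for q, j in enumerate(idx)) % P

def locate_misbehaving(responses: list, tokens: list) -> list:
    """Compare each server's response with its pre-computed token;
    the mismatching positions identify the misbehaving server(s)."""
    return [j for j, (r, v) in enumerate(zip(responses, tokens)) if r != v]

alpha, idx = 12345, [0, 2, 1]
honest, tampered = [10, 20, 30], [10, 20, 99]
tokens = [aggregate(honest, alpha, idx)] * 2        # pre-computed by the user
responses = [aggregate(honest, alpha, idx),         # server 0 is honest
             aggregate(tampered, alpha, idx)]       # server 1 modified a block
assert locate_misbehaving(responses, tokens) == [1]
```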


5.3 File Retrieval and Error Recovery

Since the layout considered here is systematic, the user can reconstruct the original file by downloading the data vectors from the first m servers, assuming that they return the correct response values. This verification scheme is based on random spot-checking, so the storage correctness assurance is probabilistic; by choosing the system parameters (e.g. r, t, l) appropriately and conducting enough rounds of verification, correct file retrieval can be guaranteed. Whenever data corruption is detected, the comparison of pre-computed tokens and received response values can guarantee the identification of the misbehaving servers. The user can always ask the servers to send back the blocks of the r rows specified in the challenge and regenerate the correct blocks by erasure correction. The newly recovered blocks can then be redistributed to the misbehaving servers to maintain the correctness of storage.
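A minimal illustration of erasure-code recovery, using a single XOR parity block, i.e. an (m+1, m) code; the scheme in the text uses an (m+k, m) Reed-Solomon code, which tolerates more failures, but the recovery idea is the same:

```python
from functools import reduce

def xor_blocks(blocks: list) -> bytes:
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)),
                        blocks))

def add_parity(data_blocks: list) -> bytes:
    """Minimal (m+1, m) erasure code: one XOR parity block."""
    return xor_blocks(data_blocks)

def recover(blocks: list, parity: bytes) -> list:
    """Rebuild a single missing block (marked None) from the survivors."""
    missing = blocks.index(None)
    blocks[missing] = xor_blocks([b for b in blocks if b is not None]
                                 + [parity])
    return blocks

data = [b"aaaa", b"bbbb", b"cccc"]
parity = add_parity(data)
assert recover([b"aaaa", None, b"cccc"], parity) == data
```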

Algorithm for Error Recovery


5.4 Providing Dynamic Data Operation Support:

5.4.1 Update Operation

Due to the linear property of the Reed-Solomon code, a user can perform the update operation and generate the updated parity blocks using Δfij only, without involving any other unchanged blocks. In the general update matrix ΔF, zero elements are used to denote the unchanged blocks. To maintain the corresponding parity vectors as well as remain consistent with the original file layout, the user can multiply ΔF by A; thus the update information for both the data vectors and the parity vectors is generated, where ΔG(j) denotes the update information for parity vector G(j).
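The linearity argument can be checked with toy integer matrices (the real scheme works over GF(2^p), but matrix multiplication distributes over addition in either setting):

```python
# If parity is G = F x A, then an update dF changes the parity by
# dF x A, so unchanged blocks never need to be touched or re-read.
def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]

def matadd(X, Y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

F  = [[1, 2], [3, 4]]   # data blocks (toy integers, not GF(2^p) symbols)
A  = [[1, 1], [0, 1]]   # toy generator matrix
dF = [[0, 5], [0, 0]]   # a single changed block; zeros mark unchanged ones

# Updating the data then encoding equals encoding the update separately:
assert matmul(matadd(F, dF), A) == matadd(matmul(F, A), matmul(dF, A))
```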

5.4.2 Delete and Insert Operation: Deletion is a special case of the update operation, in which the original data blocks are replaced with zeros or some predetermined special blocks by setting Δfij in ΔF accordingly. The updated parity information has to be blinded using the same method specified in the update operation.

An insert operation may affect many rows in the logical data file matrix F, and a substantial number of computations are required to renumber all the subsequent blocks as well as to recompute the challenge-response tokens.

5.4.3 Append Operation: If the user wants to append m blocks at the end of file F, then with the secret matrix P the user can directly calculate the appended blocks for each parity server. When the user is ready to append new blocks, both the file blocks and the corresponding parity blocks are generated; the total length of each vector G(j) will increase and fall into the range [l, lmax]. Therefore the user will update the affected tokens by adding the contribution of the new blocks to the old v_i whenever needed. The parity blinding is similar to that introduced in the update operation.

Through detailed security and performance analysis it is shown that this scheme is highly efficient and resilient to Byzantine failures, malicious data modification attacks and even server colluding attacks [9].


5.5 Conclusion

In this paper, the problem of data security in cloud data storage, which is essentially a distributed storage system, is investigated. To ensure the correctness of users' data in cloud data storage, an effective and flexible distributed scheme with explicit dynamic data support, including block update, delete, and append, is proposed. It relies on erasure-correcting code in the file distribution preparation to provide redundancy parity vectors and guarantee data dependability. By utilizing the homomorphic token with distributed verification of erasure-coded data, the scheme achieves the integration of storage correctness insurance and data error localization; that is, whenever data corruption has been detected during the storage correctness verification across the distributed servers, the simultaneous identification of the misbehaving server(s) is almost guaranteed. Through detailed security and performance analysis, it is shown that this scheme is highly efficient and resilient to Byzantine failure, malicious data modification attack, and even server colluding attacks [9].

5.6 Summary

The authors proposed an effective and flexible distributed scheme with explicit dynamic data support to ensure the correctness of users' data in the cloud. By utilizing a homomorphic token system with distributed verification of erasure-coded data, they worked on the integration of storage correctness insurance and data error localization. Further, the proposed scheme supports secure and efficient dynamic operations on data blocks, including data update, delete and append. From the analysis, the proposed scheme is efficient and resilient against Byzantine failure, malicious data modification attacks and even server colluding attacks.

The authors aimed at designing efficient mechanisms for dynamic data verification and operation to achieve storage correctness, fast localization of data errors, dynamic data support, and minimization of the effects brought by data errors or server failures.

The authors reviewed basic tools from coding theory which are needed for file distribution, together with the homomorphic token system, which belongs to the universal hash function family, preserves homomorphic properties, and is perfectly integrated with the verification of erasure-coded data; they also derived a challenge-response protocol to verify storage correctness as well as to identify misbehaving servers [9].


Chapter 6

Critical Appraisal, Recommendations and Suggestions for Future Work

The approach proposed in chapter 3 encrypts every data block with a different key so as to achieve cryptography-based access control flexibly. The owner has to maintain only a few secrets by adopting key derivation methods. From the analysis, the key derivation procedure using hash functions introduces very limited computational overhead. The approach provides fine-grained access control to outsourced data with flexible and efficient management, and the owner does not need to access the storage server except for data updates. A comprehensive mechanism is introduced to handle dynamics in user access rights and updates to outsourced data; this mechanism does not tie end users to any specific encryption algorithm, so they can make their own choices based on the requirements of the application. The key derivation tree structure allows a data consumer to use a few keys to generate all the secrets it needs. The key distribution and update problem is beyond this approach; only the simple case of outsourced data with a single owner is considered, and the approach can be extended to scenarios in which the data has multiple owners, each of whom can choose data blocks independently.

To maintain data consistency, the update operations should execute in an orderly fashion when owners want to change the data contents; this can be achieved through a semaphore flag at the service provider, which is not discussed in this approach [15].


The approach proposed in chapter 4 addresses the cloud computing environment as a dynamic environment in which the user's data transmits from the data centre to the user's client and changes all the time, with HDFS as the typical distributed file system architecture used in large-scale cloud computing. All data security techniques are built on confidentiality, integrity and availability; taking these into consideration, a mathematical data model is designed [25].

The approach proposed in chapter 5 utilizes the homomorphic token with distributed verification of erasure-coded data to achieve the integration of storage correctness insurance and data error localization, that is, to identify misbehaving servers; it further supports secure and efficient dynamic operations on data blocks, including data update, delete and append. This construction drastically reduces the communication and storage overhead compared to traditional replication-based file distribution techniques. Extensive security and performance analysis shows that the proposed scheme is highly efficient and resilient against Byzantine failure, malicious data modification attacks and even server colluding attacks.

The point-to-point communication channels between each cloud server and the user are assumed to be authenticated and reliable, which can be achieved in practice with little overhead, but multipoint communication is not considered. The issue of data privacy is not addressed, as in cloud computing data privacy is orthogonal to the proposed approach. An efficient insert operation is difficult to support in the given approach, as it may affect many rows in the logical data file matrix and a substantial number of computations are required to renumber all the subsequent blocks as well as to re-compute the challenge-response tokens [9].

6.1 Future work:

Future work includes studying the semaphore flag at the service provider, drawing on operating systems and distributed databases for access to shared resources; working on new key management schemes that apply to write-many-read applications; and working on an efficient insert operation for dynamic data, on publicly verifiable models and dynamic cloud data storage, and on fine-grained data error localization.

Comparing the three approaches shows that none of them is certain to secure data storage in cloud computing; the area is full of challenges, of paramount importance, and still in its infancy, and work on a data model architecture should be considered to secure data in cloud computing.


Chapter 7

Conclusions

This dissertation reviewed basic coding tools, such as homomorphic tokens, that are needed for file distribution across cloud servers; studied, from the proposed approaches, the analysis of the computational, storage and communication overhead of a data access operation; examined efficient key management methods, such as the key derivation hierarchy, to prevent revoked users from getting access to outsourced data through eavesdropping; and designed a mathematical data model.


Appendix-A

Securing Data Storage in Cloud Computing

Anil Kumar Mysa E-mail: [email protected]

Supervisor: Dr. Nicholas Ioannides

Computer Networking

London Metropolitan University

London, U.K.

ABSTRACT

Cloud computing has become a significant technology trend, and many experts expect it to reshape information technology processes and the IT marketplace in the next few years. Security, and data security in particular, becomes more important when using cloud computing at all levels: Infrastructure as a Service (IaaS), Platform as a Service (PaaS) and Software as a Service (SaaS). A major reason for the lack of effective data security is simply the limitations of current encryption capabilities. In this paper novel techniques are discussed, such as the key derivation method, homomorphic tokens and backup-assisted revocation schemes, proposed to achieve stronger data security: storage correctness and fast localization of data errors. This is followed by a discussion of the performance evaluation and an outlook into the future.

INTRODUCTION

Cloud computing is not an innovation but a means of constructing IT services that use advanced computational power and improved storage capabilities; it has drawn the attention of major industrial companies and scientific communities as well as user groups. The main focus of cloud computing, from the provider's view, is extraneous hardware connected to support downtime on any device in the network. Critics argue that cloud computing is not secure enough, since data leaves companies' local area networks. It is up to the clients to decide between vendors, depending on how willing the vendors are to implement secure policies and be subject to third party verifications.

Encryption is a well-known approach to addressing these types of security threats. For protection in the cloud, the enterprise would need to encrypt all data and communications. While it is not that difficult to add encryption software initially to the application environment, the new configuration requires ongoing management and maintenance. And in order to run the application in the cloud, the enterprise needs to deliver the encryption keys to the cloud to decrypt the data, creating additional security risks by exposing the keys in the operating environment. Recent advances in cryptography could mean that future cloud computing services will not only be able to encrypt documents to keep them safe in the cloud, but also make it possible to search and retrieve this information without first decrypting it.

Encrypted search architectures and tools have been developed by groups at several universities and companies. Though there is a variety of approaches, most technologies encrypt each data file as well as tags, called metadata, that describe the contents of those files, and issue a master key to the user. The token used to search through encrypted data contains functions that find matches to the metadata attached to certain files and then return the encrypted files to the user. Once the user has a file, he can use his master decryption key to decrypt it.
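As a toy illustration of this idea (not any specific published scheme), the sketch below tags each encrypted file with deterministic HMAC keyword tags derived from a master key; the provider can then match a search token against the tags without ever seeing plaintext or keywords. The file names, keywords and the XOR stream cipher are illustrative assumptions only.

```python
import hashlib, hmac, os

def keyword_tag(key: bytes, keyword: str) -> bytes:
    # Deterministic tag for a metadata keyword (toy; real schemes add randomness).
    return hmac.new(key, keyword.encode(), hashlib.sha256).digest()

def xor_encrypt(key: bytes, data: bytes) -> bytes:
    # Toy stream cipher: XOR with a hash-expanded keystream (illustration only).
    stream, counter = b"", 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

master_key = os.urandom(32)
store = []  # (encrypted file, set of keyword tags) held by the provider
for name, text, words in [("a.txt", b"quarterly report", ["finance", "2010"]),
                          ("b.txt", b"holiday photos", ["personal"])]:
    store.append((xor_encrypt(master_key, text),
                  {keyword_tag(master_key, w) for w in words}))

# The user hands the provider a search token for "finance"; the provider
# matches it against tags and returns the still-encrypted hits.
token = keyword_tag(master_key, "finance")
hits = [blob for blob, tags in store if token in tags]
print(xor_encrypt(master_key, hits[0]))  # the user decrypts locally
```

Only the user, who holds the master key, can turn the returned ciphertext back into plaintext; the provider learns at most which stored tags matched the token.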

Firstly, traditional cryptographic primitives for data security protection cannot be directly adopted, because users lose control of their data under cloud computing. Verification of correct data storage in the cloud must therefore be conducted without explicit knowledge of the whole data. Considering the various kinds of data each user stores in the cloud and the demand for long-term continuous assurance of their safety, the problem of verifying the correctness of data storage in the cloud becomes even more challenging.

Secondly, cloud computing is not just a third-party data warehouse. The data stored in the cloud may be frequently updated by the users, including insertion, deletion, modification, appending, reordering, etc. Ensuring storage correctness under dynamic data updates is hence of paramount importance. However, this dynamic feature also renders traditional integrity assurance techniques futile and demands new solutions. Last but not least, the deployment of cloud computing is powered by data centres running in a simultaneous, cooperative and distributed manner, and an individual user's data is redundantly stored in multiple physical locations to further reduce data integrity threats. Distributed protocols for storage correctness assurance will therefore be of the greatest importance in achieving a robust and secure cloud data storage system in the real world, yet more research effort is needed to achieve flexible access control to large-scale dynamic data.

To enable secure and efficient access to outsourced data, investigators have tried to integrate key derivation mechanisms [16, 17, 18, 19] with encryption-based data access control. The authors of [20] propose a generic method that uses only hash functions to derive a descendant's key in a hierarchy; the method can handle updates locally and avoid propagation. Although the proposed key derivation tree structure can be viewed as a special case of access hierarchies, analysis shows that it serves the studied application better. In [21], the authors divide users into groups based on their access rights to the data. The users are then organized into a hierarchy and further transformed into a tree structure to reduce the number of encryption keys; this also reduces the number of keys that must be given to each user during the initiation procedure. In [22], data records are organized into groups based on the users that can access them. Since the data in the same group are encrypted with the same key, changes to user access rights lead to updates in the data organization. While a creative idea in this approach is to allow servers to conduct a second level of encryption (over-encryption) to control access, repeated access revocation and granting may lead to a very complicated hierarchy structure for key management. The approach in [23] stores multiple copies of the same data record encrypted with different keys; when access rights change, re-encryption and data updates at the server must be conducted. These operations cause extra overhead on the server and do not fit the proposed approach. An experimental evaluation of these approaches can be found in [24].

The basic idea is to generate the data block encryption keys through a hierarchy. Every key in the hierarchy can be derived by combining its parent node with some public information. As the derivation procedure uses a one-way function, the secret keys of the parent node and sibling nodes cannot be calculated. In this way the data owner needs to maintain only the root nodes of the hierarchy; during the key distribution procedure the owner sends secrets in the hierarchy to end users based on their access rights, and the end users derive the leaf nodes of the hierarchy to decrypt the data blocks.

Key Derivation Hierarchy

Assuming that the outsourced data contain n blocks and 2^(p-1) <= n < 2^p, a binary tree structure with height p can be built. The data owner chooses a root secret K_{0,1}, where the first index of a key represents its level in the hierarchy and the second index represents its sequence within that level. The data owner also chooses a public hash function h(). For any node K_{i,j} in the hierarchy, its left child can be calculated as K_{i+1,2j-1} = h(K_{i,j} || (2j-1) || K_{i,j}), that is, by sandwiching the sequence number of the child node between two copies of the parent's key and then applying the hash function; the right child K_{i+1,2j} of K_{i,j} is calculated in the same way. A node can thus calculate the secrets of all its descendants by applying this function repeatedly, and at level p of the hierarchy the hash results are used as keys to encrypt the data blocks.
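The derivation above can be sketched as follows. The sandwich construction (child sequence number between two copies of the parent key) follows the prose description; the choice of SHA-256, the byte encoding of the sequence number, and the path arithmetic are illustrative assumptions.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def child_key(parent: bytes, child_seq: int) -> bytes:
    # Sandwich the child's sequence number between two copies of the parent
    # key, then hash; the one-way hash stops a child from recovering its
    # parent or siblings.
    return h(parent + str(child_seq).encode() + parent)

def block_key(root: bytes, p: int, block_index: int) -> bytes:
    """Derive the leaf key K_{p,block_index} by walking down the binary tree."""
    key, seq = root, 1                              # start at K_{0,1}
    for level in range(1, p + 1):
        bit = (block_index - 1) >> (p - level) & 1  # 0 = left, 1 = right
        seq = 2 * seq - 1 + bit                     # children of node j are 2j-1, 2j
        key = child_key(key, seq)
    return key

# The owner stores only the root; any subtree key unlocks exactly the
# blocks beneath it.
root = h(b"owner root secret K_0,1")
print(block_key(root, 3, 5).hex()[:16])
```

A user authorised for a contiguous range of blocks can be given the single ancestor key covering that range and derive the leaf keys locally.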

Data Access Procedure

To prevent revoked users from getting access to outsourced data through eavesdropping, the service provider conducts over-encryption when it sends data blocks to end users; the service provider and the end users share a pseudo-random bit sequence generator P().

In the following, O denotes the data owner, S the service provider and U an end user.

Since only U and O know K_ou, O is able to authenticate the sender. The request index is increased by 1 every time U sends out a request, and it is used by O to defend against replay attacks. The request contains the index numbers of the data blocks that U wants to access, and a Message Authentication Code (MAC) protects the integrity of the packet.
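A minimal sketch of such a request follows, assuming (as an illustration, not the paper's wire format) a JSON packet: the shared key K_ou authenticates the sender via an HMAC, and the strictly increasing request index defeats replays.

```python
import hmac, hashlib, json

def make_request(shared_key: bytes, user: str, req_index: int, blocks: list) -> dict:
    """Build the user's data request: a monotonically increasing index defends
    against replay, and an HMAC protects the packet's integrity."""
    body = {"user": user, "index": req_index, "blocks": blocks}
    mac = hmac.new(shared_key, json.dumps(body, sort_keys=True).encode(),
                   hashlib.sha256).hexdigest()
    return {**body, "mac": mac}

def owner_check(shared_key: bytes, req: dict, last_index: int) -> bool:
    body = {k: req[k] for k in ("user", "index", "blocks")}
    expected = hmac.new(shared_key, json.dumps(body, sort_keys=True).encode(),
                        hashlib.sha256).hexdigest()
    # Reject tampered packets and any index not strictly greater than the last seen.
    return hmac.compare_digest(expected, req["mac"]) and req["index"] > last_index

k_ou = b"secret shared by owner O and user U"
req = make_request(k_ou, "U", 7, [3, 4, 5])
print(owner_check(k_ou, req, last_index=6))   # fresh request -> True
print(owner_check(k_ou, req, last_index=7))   # replayed request -> False
```

Any modification of the block list or the index invalidates the MAC, so the owner can safely consult its access control matrix on the verified fields.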

When O receives this message, it authenticates the sender and verifies the integrity. It then examines its access control matrix to make sure that U is authorised to read all blocks in the request. If the request passes this check, the owner determines the smallest set of keys K' such that (1) K' can derive the keys that are used to encrypt the requested data blocks, and (2) U is authorized to know all keys that can be derived from K'; this set can be determined by an algorithm. The owner then generates the reply to the end user.

The ACM index is used by O to label the freshness of the Access Control Matrix (ACM). This index is increased by 1 every time O changes some end user's access rights, and the updated ACM index is sent to S by O to prevent revoked users from using old certificates to access data blocks. The seed is a random number used to initiate P() so that U can decrypt the over-encryption conducted by S. U uses K' to derive the data block encryption keys. The format of the certificate for the service provider is as follows.

The user U sends this certificate along with its request to the service provider. When S receives the packet, it can verify that the cert was generated by O, since only they know the secret key K_os. S makes sure that the user name and the request index in the cert match the values in the packet. If the ACM index in the cert is smaller than the value that S has received from O, some changes to the access control matrix have happened and S notifies U to get a new cert. Otherwise, the service provider retrieves the encrypted data blocks and conducts the over-encryption as follows: using the seed as the initial state of P(), the function generates a long sequence of pseudo-random bits, and S uses this bit sequence as a one-time pad, XORing it with the blocks to encrypt them. The computation results are sent to U. When U receives the data blocks, it uses the seed to regenerate the pseudo-random bit sequence and uses K' to derive the encryption keys; the data blocks are then recovered.
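The over-encryption step can be sketched as a seeded pseudo-random pad. The hash-counter construction for P() below is an illustrative assumption; the essential property is that S and U derive the same pad from the same seed, while an eavesdropper without the seed cannot.

```python
import hashlib, os

def prg(seed: bytes, nbytes: int) -> bytes:
    """Shared pseudo-random bit sequence generator P() (toy: hash of a counter)."""
    out, ctr = b"", 0
    while len(out) < nbytes:
        out += hashlib.sha256(seed + ctr.to_bytes(8, "big")).digest()
        ctr += 1
    return out[:nbytes]

def over_encrypt(seed: bytes, block: bytes) -> bytes:
    # XOR the (already encrypted) block with a one-time pad drawn from the
    # seeded generator; applying the same function again strips the pad.
    pad = prg(seed, len(block))
    return bytes(a ^ b for a, b in zip(block, pad))

ciphertext_block = os.urandom(32)           # block already encrypted under K'
seed = os.urandom(16)                       # chosen by the owner per request
wire = over_encrypt(seed, ciphertext_block)     # what S puts on the wire to U
assert over_encrypt(seed, wire) == ciphertext_block  # U, knowing the seed, recovers it
```

Because the owner picks a fresh seed per request and never sends it in plaintext, a revoked user who eavesdrops on another user's transfer sees only a pad it cannot regenerate.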

When an end user U loses access to some data blocks, the access control matrix at O is updated and sent to S through a secure channel. If U presents the old cert to S, it is rejected since its ACM index value is invalid. However, U could still get access to the data blocks by eavesdropping on the traffic between S and other end users if it has kept a copy of the key set K'. To defend against such attacks, the service provider conducts over-encryption before sending out the data blocks. Since for every data request the seed is dynamically generated by O and never transmitted in plaintext, U will not be able to regenerate the bit sequences of other end users. Therefore, unless U keeps a copy of the data blocks from a previous access, it will not be able to obtain the information.

Dynamics in User Access Rights

Lazy revocation assumes that it is acceptable for revoked users to read unmodified data blocks, but that they must not be able to read updated blocks. Lazy revocation thus trades a degree of security for reduced re-encryption and data access overhead.

When user U's access right to data block Di is revoked, the access control matrix in O is updated and the ACM index increased. At the same time, O labels this data block to show that some user's access right has been revoked since its last content update. Until Di is next updated, the owner does not change the block on the outsourced storage. Since the ACM index value has changed, U can no longer use its old cert to access Di. However, when another user gets the encrypted Di through the network, U can eavesdrop on the traffic; if the service provider does not conduct over-encryption, the data is transmitted in the same format whoever the reader is, so if U has kept a copy of the encryption key it can still get access to Di. This result, however, is the same as if U had kept a copy of Di before its access right was revoked.

When the owner needs to change the data block from Di to D'i, it checks the label and finds that some user's access rights have been revoked, so it cannot encrypt the updated data block with the current key. The solution is for the owner to encrypt, with the secret K_{p,i}, a control block that points to where the updated data is stored. When a user receives this control block from the service provider, it submits it to the owner; the owner derives the new key and sends it back to the user, and at the same time generates a new cert so that the user can get the new block from the service provider. A revoked user is still able to get access to the control block, but the owner will not send it the new encryption key and cert, so the revoked user cannot get access to the updated data.


Dynamics in Outsourced Data

When a data block Di is deleted from the outsourced data, the owner uses a special control block to replace it. The special block is encrypted with K_{p,i} and stored in the original slot for Di at the service provider. At the same time, the owner labels its access control matrix to show that the block no longer exists. End users can still access this control block, but they will not get any useful information from its contents.

Data block updating is conducted as follows. When the owner needs to update Di, it uses K_{p,i} to encrypt a control block and stores it in the i-th block of the outsourced data. The control block contains (1) (2^p + i), the index of the block in which D'i is stored, (2) x, the number of times that Di has been updated, and (3) a MAC that protects the integrity of the control block. The owner encrypts D'i with a key derived from the second root secret K'_{0,1} and x, and stores the result in the block with index number (2^p + i).

Handling updates to data blocks

When user U needs to access the updated data block D'i, it first gets the encrypted control block from S and submits it to the data owner. The owner uses the secret K_verify to examine the integrity of the control block, then uses K'_{0,1} and x to derive the encryption key of D'i. The owner returns the encryption key and a new cert to U through the secure communication channel between them, and U then gets D'i from the service provider. This method has several properties: all metadata is stored in the control block at the service provider, so the data owner only needs to store the two secrets K'_{0,1} and K_verify; and since K_verify is known only to the owner, attackers cannot generate fake control blocks.

Data blocks that are always accessed together should be given sequential block index numbers, so that the owner can derive a smaller access key set K' for users. The owner can also reserve some empty slots in the outsourced data and later insert new data into these positions based on access patterns.

Analysis of Overhead

The proposed approach introduces very limited storage overhead. The key derivation mechanism allows the owner O to store only the root keys of the hierarchies. The end user U does not need to pre-calculate and store all data block encryption keys; on the contrary, it can calculate the keys on the fly while conducting the data block decryption operations. The service provider S needs to store an extra copy of the updated data blocks, but when the data update rate in the application environment is very low, the extra storage overhead at S is also low compared to the size of the outsourced data [15].

Data Security Model for Cloud Computing

In the cloud computing environment, the traditional access control mechanism has serious shortcomings. Cloud systems are built with a new architecture composed of technologies such as Hadoop and HBase, which enhance the performance of the cloud but bring in risks at the same time. By analysing HDFS, the data security needs of cloud computing can be identified, beginning with the client authentication requirements at login: the vast majority of cloud computing is accessed through a browser client, so verifying the user's identity is the primary need of cloud computing applications. If the namenode is attacked or fails, there will be disastrous consequences for the system, so the availability and efficiency of the namenode is key to the success of data protection in cloud computing, and enhancing the namenode's security is very important.

HDFS Architecture

As the datanode is the data storage node, there is a possibility of failure, and the availability of data cannot be guaranteed. Currently each data storage block in HDFS has at least three replicas, which is the HDFS backup strategy. When it comes to ensuring safe reading and writing of data, however, HDFS offers no detailed mechanism, so the need to ensure rapid recovery and to make read and write operations fully controllable cannot be ignored. In addition, requirements such as access control and file encryption must be taken into account in a data security model for cloud computing.

All data security techniques are built on the three basic principles of confidentiality, integrity and availability. Confidentiality refers to hiding the actual data or information; since in cloud computing the data is stored in data centres, security and confidentiality are all the more important. Integrity means that data in any state must be guaranteed against unauthorized deletion, modification or damage. The data model of cloud computing can be described mathematically.

Cloud Computing Data Security Model

The model uses a three-level defence structure in which each layer performs its own duty to ensure the data security of the cloud.

The first layer is responsible for user authentication: it issues the appropriate digital certificates to users and manages user permissions.

The second layer is responsible for encrypting users' data and protecting users' privacy in a certain way.


The third layer provides fast recovery of user data and is the last layer of protection in the three-level structure. User authentication is used to ensure that data is not tampered with: an authenticated user can manage the data through operations such as add, modify and delete. If the user authentication system is deceived by illegal means and a malicious user enters the system, file encryption and privacy protection provide the next level of defence. In this layer user data is encrypted, so even if a key is illegally accessed, privacy protection means the malicious user will still be unable to obtain effective access to the information, which is very important for protecting business users' trade secrets in cloud computing. Finally, the rapid file restoration layer uses a fast recovery algorithm to let user data achieve maximum recovery even in the case of damage. Hence the cloud computing model for data security is designed [25].

Ensuring Data Storage Security in Cloud Computing

This approach proposes an effective and flexible distributed scheme with explicit dynamic data support to ensure the correctness of users' data in the cloud. An erasure-correcting code in the file distribution preparation provides redundancy and guarantees data dependability, and this construction drastically reduces the communication overhead. To achieve storage correctness assurance as well as data error localization, a homomorphic token with distributed verification of erasure-coded data is utilized.

The main idea is as follows: before file distribution, the user pre-computes a certain number of short verification tokens on each individual vector G(j) (j in {1, ..., n}), each token covering a random subset of data blocks. To check storage correctness, the user challenges the provider with a set of randomly generated block indices; each server computes a short signature over the specified blocks and returns it to the user, who compares it with the pre-computed tokens. The requested response values for the integrity check must also be a valid codeword determined by the secret matrix P.

Challenge Token Preparation

Suppose the user wants to challenge the cloud servers t times to ensure the correctness of data storage. The user must then pre-compute t verification tokens for each G(j) (j in {1, ..., n}) using a PRF f(.), a PRP Phi(.), a challenge key k_chal and a master permutation key K_PRP. To generate the i-th token for server j, the user acts as follows: (1) derive a random challenge value alpha_i of GF(2^p) by alpha_i = f_kchal(i), and a permutation key k_prp^(i) based on K_PRP; (2) compute the set of r randomly chosen indices {I_q in [1, ..., l] | I_q = Phi_kprp^(i)(q), 1 <= q <= r}; (3) calculate the token as v_i(j) = sum over q = 1..r of alpha_i^q * G(j)[I_q]. The token v_i(j), an element of GF(2^p) of small size, is the response the user expects to receive from server j when the user challenges it on the specified data blocks.

Once all tokens are computed, the final step before file distribution is to blind each parity block G(j)[i] with a value derived from the secret key k_j for the parity vector G(j) (j in {m+1, ..., n}), before distributing the vectors across the cloud servers S1, S2, ..., Sn.

5.2 Correctness Verification and Error Localization

To issue the i-th challenge, the user reveals alpha_i as well as the i-th permutation key k_prp^(i) to each server. The server storing vector G(j) aggregates the r rows specified by k_prp^(i) into a linear combination R_i(j) and returns it. Upon receiving R_i(j) from all the servers, the user takes away the blinding values from the parity responses. Since all the servers operate over the same subset of indices, the linear aggregation of these r specified rows has to be a codeword in the encoded file matrix. Once an inconsistency among the storage is detected, the pre-computed verification tokens are used to determine where the potential data errors lie. Because each response R_i(j) is computed in exactly the same way as the token v_i(j), the user can find which server is misbehaving simply by verifying the n equations v_i(j) = R_i(j). An algorithm gives the details of correctness verification and error localization.
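The token precomputation, challenge response and error localization steps above can be sketched together as follows. For brevity the sketch works modulo a prime rather than in GF(2^p), and the PRF and PRP constructions (hash of a counter, seeded shuffle) are illustrative assumptions, not the scheme's actual primitives.

```python
import hashlib, random

random.seed(0)
P = 2**61 - 1  # arithmetic modulo a prime stands in for GF(2^p) in this sketch

def prf(key: bytes, i: int) -> int:
    return int.from_bytes(hashlib.sha256(key + i.to_bytes(8, "big")).digest(), "big") % P

def prp_indices(key: bytes, i: int, l: int, r: int):
    # Pseudo-random choice of r block indices out of l, keyed per challenge round.
    rng = random.Random(prf(key, i))
    return rng.sample(range(l), r)

def token(alpha: int, rows: list, idx: list) -> int:
    # v_i = sum over q of alpha^q * G[I_q]  (the linear-combination signature)
    return sum(pow(alpha, q + 1, P) * rows[j] for q, j in enumerate(idx)) % P

k_chal, k_prp, l, r = b"chal", b"prp", 10, 4
vectors = {j: [random.randrange(P) for _ in range(l)] for j in range(3)}  # servers' G(j)

i = 1                                   # challenge round i
alpha = prf(k_chal, i)
idx = prp_indices(k_prp, i, l, r)
pre = {j: token(alpha, vectors[j], idx) for j in vectors}   # user's stored tokens

vectors[2][idx[0]] ^= 1                 # server 2 silently corrupts a challenged block
responses = {j: token(alpha, vectors[j], idx) for j in vectors}
misbehaving = [j for j in vectors if responses[j] != pre[j]]
print(misbehaving)
```

Honest servers reproduce the pre-computed token exactly, so comparing responses against tokens both verifies correctness and localizes the misbehaving server in one pass.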

5.3 File Retrieval and Error Recovery

Since the layout considered here is systematic, the user can reconstruct the original file by downloading the data vectors from the first m servers, assuming they return correct response values. Because this verification scheme is based on random spot-checking, the storage correctness assurance is probabilistic; however, by choosing the system parameters (e.g., r, t, l) appropriately and conducting enough rounds of verification, file retrieval can be guaranteed. Whenever data corruption is detected, the comparison of pre-computed tokens and received response values guarantees the identification of the misbehaving servers. The user can then ask the servers to send back the blocks of the r rows specified in the challenge and regenerate the correct blocks by erasure correction. The newly recovered blocks can then be redistributed to the misbehaving servers to maintain the correctness of storage.

Providing Dynamic Data Operation Support

Update Operation

Due to the linear property of the Reed-Solomon code, a user can perform an update operation and generate the updated parity blocks using only the changed values Delta-f_ij, without involving any other unchanged block. The general update matrix Delta-F uses zero elements to denote the unchanged blocks. To maintain the corresponding parity vectors as well as stay consistent with the original file layout, the user can multiply Delta-F by A, which generates the update information for both the data vectors and the parity vectors, where Delta-G(j) denotes the update information for the parity vector G(j).
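The linearity argument can be demonstrated with a toy linear code over a small prime field standing in for the Reed-Solomon code over GF(2^p); the matrix A, field size and block values below are illustrative.

```python
import random

P = 97                               # small prime field for the sketch
m, k = 4, 2                          # m data blocks per row, k parity blocks
random.seed(1)
A = [[random.randrange(1, P) for _ in range(k)] for _ in range(m)]  # parity part of the generator

def parity(data):
    # G(m+j) = sum_i data[i] * A[i][j]  (linear encoding of one file row)
    return [sum(d * A[i][j] for i, d in enumerate(data)) % P for j in range(k)]

F = [10, 20, 30, 40]                 # one row of the file matrix
G = parity(F)                        # its parity blocks

# The user updates only block 1 (20 -> 25); Delta-F is zero elsewhere.
dF = [0, 25 - 20, 0, 0]
dG = parity(dF)                      # parity update info computed from dF alone
G_new = [(g + d) % P for g, d in zip(G, dG)]

F[1] = 25
assert G_new == parity(F)            # identical to re-encoding the whole row
print(G_new)
```

Because encoding is linear, adding the encoded update `dG` to the old parity equals re-encoding the updated row from scratch, so unchanged blocks never need to be touched or transmitted.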

5.4.2 Delete and Insert Operations: Deletion is a special case of the update operation, in which the original data blocks are replaced with zeros or some predetermined special blocks by setting Delta-f_ij in Delta-F accordingly; the updated parity information has to be blinded using the same method as in the update operation. An insert operation, in contrast, may affect many rows in the logical data file matrix F, and a substantial number of computations are required to renumber all the subsequent blocks as well as to recompute the challenge-response tokens.

5.4.3 Append Operation: Suppose the user wants to append m blocks at the end of file F. With the secret matrix P, the user can directly calculate the append blocks for each parity server. When the user is ready to append new blocks, both the file blocks and the corresponding parity blocks are generated; the total length of each vector G(j) increases and falls into the range [l, l_max]. The user therefore updates each affected token by adding the contribution of the appended blocks to the old v_i whenever the token covers an appended position. The parity blinding is similar to that introduced for the update operation.

Through detailed security and performance analysis it is shown that this scheme is highly efficient and resilient to Byzantine failures, malicious data modification attacks and even server colluding attacks [9].

Conclusion

The approach proposed in chapter 3 encrypts every data block with a different key in order to achieve flexible cryptography-based access control. By adopting key derivation methods, the owner has to maintain only a few secrets, and analysis shows that the key derivation procedure using hash functions introduces very limited computational overhead. The approach provides fine-grained access control to outsourced data with flexible and efficient management, and does not need to access the storage server except for data updates. A comprehensive mechanism is introduced to handle dynamics in user access rights and updates to outsourced data; this mechanism does not depend on any specific encryption algorithm, so end users can make their own choices based on the requirements of the application. The key derivation tree structure allows a data consumer to use a few keys to generate all the secrets it needs.

The key distribution and update problem is beyond this approach, which considers only the simple case of outsourced data with a single owner; it can be extended to scenarios in which the data has multiple owners, each of whom can choose data blocks independently.


To maintain data consistency, the update operations should be executed in order when owners want to change the data contents. This can be achieved through a semaphore flag at the service provider, which is not discussed in this approach [15].

In the approach of chapter 4, the cloud computing environment is a dynamic one in which user data travels from the data centre to the user's client and changes all the time; HDFS, used in large-scale cloud computing, is a typical distributed file system architecture. All data security techniques are built on confidentiality, integrity and availability, and taking these into consideration a mathematical data model is designed [25].

The approach proposed in chapter 5 utilizes the homomorphic token with distributed verification of erasure-coded data to integrate storage correctness assurance and data error localization, that is, to identify misbehaving servers. It further supports secure and efficient dynamic operations on data blocks, including data update, delete and append. This construction drastically reduces the communication and storage overhead compared to traditional replication-based file distribution techniques. Extensive security and performance analysis shows that the proposed scheme is highly efficient and resilient against Byzantine failures, malicious data modification attacks and even server colluding attacks.

The approach assumes that the point-to-point communication channels between each cloud server and the user are authenticated and reliable, which can be achieved in practice with little overhead, but multipoint communication is not considered. The issue of data privacy is not addressed, as in cloud computing data privacy is orthogonal to the proposed approach.

An efficient insert operation is difficult to support in the given approach, as it may affect many rows in the logical data file matrix, and a substantial number of computations are required to renumber all the subsequent blocks as well as to re-compute the challenge-response tokens [9].

Future work:

Future work includes studying semaphore flags at the service provider, drawing on operating systems and distributed databases for access to shared resources, and developing new key management schemes for write-many-read applications. Further work is needed on an efficient insert operation for dynamic data, on publicly verifiable models for dynamic cloud data storage, and on fine-grained data error localization.

Comparing the three approaches shows that none of them is certain to secure data storage in cloud computing; the area, full of challenges and of paramount importance, is still in its infancy, and further work on data model architectures must be considered to secure data in cloud computing.

References and Bibliography

[1] http://www.apache.org/docs/current/hdfs_design.html
[2] http://en.wikipedia.org/wiki/Reed–Solomon_error_correction
[3] http://en.wikipedia.org/wiki/Universal_hashing
[4] http://en.wikipedia.org/wiki/Homomorphic_encryption
[5] http://en.wikipedia.org/wiki/Byzantine_fault_tolerance
[6] Traian Andrei, "Cloud Computing Challenges and Related Security Issues".
[7] Ellen Rubin, http://www.cloudswitch.com/page/making-cloud-computing-secure-for-the-enterprise
[8] David Talbot, "Searching an Encrypted Cloud", http://www.technologyreview.com/
[9] Cong Wang, Qian Wang, Kui Ren, and Wenjing Lou, "Ensuring Data Storage Security in Cloud Computing", 978-1-4244-3876-1/09/$25.00 ©2009 IEEE.
[10] B. P. Rimal, Eunmi Choi, and I. Lumb, "A Taxonomy and Survey of Cloud Computing Systems", Fifth International Joint Conference on INC, IMS and IDC (NCM '09), 25-27 Aug. 2009, pages 44-51. DOI 10.1109/NCM.2009.218.
[13] M. Jensen, J. Schwenk, N. Gruschka, and L. L. Iacono, "On Technical Security Issues in Cloud Computing", IEEE International Conference on Cloud Computing (CLOUD '09), 21-25 Sept. 2009, pages 109-116. DOI 10.1109/CLOUD.2009.60.
[13] Tim Mather, Subra Kumaraswamy, and Shahed Latif, "Privacy and Security in Cloud Computing", pages 61-71.
[14] Cloud Security Alliance guide.
[15] Weichao Wang, Rodney Owens, Zhiwei Li, and Bharat Bhargava, "Secure and Efficient Access to Outsourced Data", CCSW '09, November 13, 2009, Chicago, Illinois, USA. Copyright 2009 ACM 978-1-60558-784-4/09/11.
[16] T. Chen, Y. Chung, and C. Tian, "A novel key management scheme for dynamic access control in a user hierarchy", IEEE Annual International Computer Software and Applications Conference, pages 396-401, 2004.
[17] H. Chien and J. Jan, "New hierarchical assignment without public key cryptography", Computers & Security, 22(6):523-526, 2003.
[18] C. Lin, "Hierarchical key assignment without public-key cryptography", Computers & Security, 20(7):612-619, 2001.
[19] S. Zhong, "A practical key management scheme for access control in a user hierarchy", Computers & Security, 21(8):750-759, 2002.
[20] M. J. Atallah, M. Blanton, N. Fazio, and K. B. Frikken, "Dynamic and efficient key management for access hierarchies", ACM Trans. Inf. Syst. Secur., 12(3):1-43, 2009.
[21] E. Damiani, S. D. C. di Vimercati, S. Foresti, S. Jajodia, S. Paraboschi, and P. Samarati, "Key management for multi-user encrypted databases", Proceedings of the ACM Workshop on Storage Security and Survivability, pages 74-83, 2005.
[22] S. D. C. di Vimercati, S. Foresti, S. Jajodia, S. Paraboschi, and P. Samarati, "Over-encryption: management of access control evolution on outsourced data", Proceedings of the International Conference on Very Large Data Bases, pages 123-134, 2007.
[23] S. D. C. di Vimercati, S. Foresti, S. Jajodia, S. Paraboschi, and P. Samarati, "A data outsourcing architecture combining cryptography and access control", Proceedings of the ACM Workshop on Computer Security Architecture, pages 63-69, 2007.
[24] E. Damiani, S. De Capitani di Vimercati, S. Foresti, S. Jajodia, S. Paraboschi, and P. Samarati, "An Experimental Evaluation of Multi-Key Strategies for Data Outsourcing", IFIP International Federation for Information Processing, Volume 232, New Approaches for Security, Privacy and Trust in Complex Environments, pages 385-396. Springer, 2007.
[25] Dai Yuefa, Wu Bo, Gu Yaqiang, Zhang Quan, and Tang Chaojing, "Data Security Model for Cloud Computing".
[26] A. Juels and J. Burton S. Kaliski, "PORs: Proofs of Retrievability for Large Files", Proc. of CCS '07, pp. 584-597, 2007.
[27] H. Shacham and B. Waters, "Compact Proofs of Retrievability", Proc. of Asiacrypt '08, Dec. 2008.
[28] K. D. Bowers, A. Juels, and A. Oprea, "Proofs of Retrievability: Theory and Implementation", Cryptology ePrint Archive, Report 2008/175, 2008, http://eprint.iacr.org/.
[29] K. D. Bowers, A. Juels, and A. Oprea, "HAIL: A High-Availability and Integrity Layer for Cloud Storage", Cryptology ePrint Archive, Report 2008/489, 2008, http://eprint.iacr.org/.
[30] G. Ateniese, R. Burns, R. Curtmola, J. Herring, L. Kissner, Z. Peterson, and D. Song, "Provable Data Possession at Untrusted Stores", Proc. of CCS '07, pp. 598-609, 2007.
[31] G. Ateniese, R. D. Pietro, L. V. Mancini, and G. Tsudik, "Scalable and Efficient Provable Data Possession", Proc. of SecureComm '08, pp. 1-10, 2008.
[32] R. Curtmola, O. Khan, R. Burns, and G. Ateniese, "MR-PDP: Multiple-Replica Provable Data Possession", Proc. of ICDCS '08, pp. 411-420, 2008.
[33] M. Lillibridge, S. Elnikety, A. Birrell, M. Burrows, and M. Isard, "A Cooperative Internet Backup Scheme", Proc. of the 2003 USENIX Annual Technical Conference (General Track), pp. 29-41, 2003.
[34] D. L. G. Filho and P. S. L. M. Barreto, "Demonstrating Data Possession and Uncheatable Data Transfer", Cryptology ePrint Archive, Report 2006/150, 2006, http://eprint.iacr.org/.
[35] M. A. Shah, M. Baker, J. C. Mogul, and R. Swaminathan, "Auditing to Keep Online Storage Services Honest", Proc. 11th USENIX Workshop on Hot Topics in Operating Systems (HOTOS '07), pp. 1-6, 2007.
[36] T. S. J. Schwarz and E. L. Miller, "Store, Forget, and Check: Using Algebraic Signatures to Check Remotely Administered Storage", Proc. of ICDCS '06, pp. 12-12, 2006.

64

Page 65: DESSERTATION FINAL REPORT

Bibliography and References

References and Bibliography

[1]http://www.apache.org./docs/current/hdfs_design.html[2]http;//www.ia.org/wiki/Reed–Solomon_error_correction[3]http;//www.ia.org/wiki/Universal_hashing[4]http://www.ia.org/wiki/Homomorphic_encryption[5]http://www.a.org/wiki/Byzantine_fault_tolerance[6]Cloud computingchallenges and related security issues by TraianAndrei[7]http://www.cloudswitch.com/page/making-cloud-computing-secure-for-the-enterprise. By Eellen Rubin[8]http;//www.technologyreview.com/searching an encrypted cloud by David Talbot[9]Ensuring Data Storage security in cloud computing by congwang,qianwang and kui ren,wenjing lou 978-1-4244-3876-1/09/$25.00 ©2009 IEEE[10] Ataxonomy and survey of cloud computing systems by Bhasker Prasad Rimal.Eunumi choi. Ian lumb. Rimal, B.P.; Eunmi Choi; Lumb, I.;INC, IMS and IDC, 2009. NCM '09. Fifth International Joint Conference on25-27 Aug. 2009 Page(s):44 - 51 Digital Object Identifier 10.1109/NCM.2009.218[11] http://www.enisa.europa.eu/[12] www.isaca.org

[13] M. Jensen, J. Schwenk, N. Gruschka, and L. Lo Iacono, “On Technical Security Issues in Cloud Computing,” Proc. of the IEEE International Conference on Cloud Computing (CLOUD ’09), 21–25 Sept. 2009, pp. 109–116, DOI 10.1109/CLOUD.2009.60.
[13] T. Mather, S. Kumaraswamy, and S. Latif, Cloud Security and Privacy, pp. 61–71.
[14] Cloud Security Alliance, Security Guidance for Critical Areas of Focus in Cloud Computing.
[15] W. Wang, Z. Li, R. Owens, and B. Bhargava, “Secure and Efficient Access to Outsourced Data,” Proc. of CCSW ’09, 13 Nov. 2009, Chicago, Illinois, USA, ACM 978-1-60558-784-4/09/11.
[16] T. Chen, Y. Chung, and C. Tian, “A novel key management scheme for dynamic access control in a user hierarchy,” IEEE Annual International Computer Software and Applications Conference, pp. 396–401, 2004.
[17] H. Chien and J. Jan, “New hierarchical assignment without public key cryptography,” Computers & Security, 22(6):523–526, 2003.
[18] C. Lin, “Hierarchical key assignment without public-key cryptography,” Computers & Security, 20(7):612–619, 2001.
[19] S. Zhong, “A practical key management scheme for access control in a user hierarchy,” Computers & Security, 21(8):750–759, 2002.
[20] M. J. Atallah, M. Blanton, N. Fazio, and K. B. Frikken, “Dynamic and efficient key management for access hierarchies,” ACM Trans. Inf. Syst. Secur., 12(3):1–43, 2009.
[21] E. Damiani, S. D. C. di Vimercati, S. Foresti, S. Jajodia, S. Paraboschi, and P. Samarati, “Key management for multi-user encrypted databases,” Proc. of the ACM Workshop on Storage Security and Survivability, pp. 74–83, 2005.
[22] S. D. C. di Vimercati, S. Foresti, S. Jajodia, S. Paraboschi, and P. Samarati, “Over-encryption: management of access control evolution on outsourced data,” Proc. of the International Conference on Very Large Data Bases, pp. 123–134, 2007.
[23] S. D. C. di Vimercati, S. Foresti, S. Jajodia, S. Paraboschi, and P. Samarati, “A data outsourcing architecture combining cryptography and access control,” Proc. of the ACM Workshop on Computer Security Architecture, pp. 63–69, 2007.
[24] E. Damiani, S. De Capitani di Vimercati, S. Foresti, S. Jajodia, S. Paraboschi, and P. Samarati, “An Experimental Evaluation of Multi-Key Strategies for Data Outsourcing,” IFIP International Federation for Information Processing, Volume 232, New Approaches for Security, Privacy and Trust in Complex Environments, pp. 385–396, Springer, 2007.
[25] Dai Yuefa, Wu Bo, Gu Yaqiang, Zhang Quan, and Tang Chaojing, “Data Security Model for Cloud Computing.”
[26] A. Juels and J. Burton S. Kaliski, “PORs: Proofs of Retrievability for Large Files,” Proc. of CCS ’07, pp. 584–597, 2007.
[27] H. Shacham and B. Waters, “Compact Proofs of Retrievability,” Proc. of Asiacrypt ’08, Dec. 2008.
[28] K. D. Bowers, A. Juels, and A. Oprea, “Proofs of Retrievability: Theory and Implementation,” Cryptology ePrint Archive, Report 2008/175, 2008, http://eprint.iacr.org/.
[29] K. D. Bowers, A. Juels, and A. Oprea, “HAIL: A High-Availability and Integrity Layer for Cloud Storage,” Cryptology ePrint Archive, Report 2008/489, 2008, http://eprint.iacr.org/.
[30] G. Ateniese, R. Burns, R. Curtmola, J. Herring, L. Kissner, Z. Peterson, and D. Song, “Provable Data Possession at Untrusted Stores,” Proc. of CCS ’07, pp. 598–609, 2007.
[31] G. Ateniese, R. D. Pietro, L. V. Mancini, and G. Tsudik, “Scalable and Efficient Provable Data Possession,” Proc. of SecureComm ’08, pp. 1–10, 2008.
[32] R. Curtmola, O. Khan, R. Burns, and G. Ateniese, “MR-PDP: Multiple-Replica Provable Data Possession,” Proc. of ICDCS ’08, pp. 411–420, 2008.
[33] M. Lillibridge, S. Elnikety, A. Birrell, M. Burrows, and M. Isard, “A Cooperative Internet Backup Scheme,” Proc. of the 2003 USENIX Annual Technical Conference (General Track), pp. 29–41, 2003.
[34] D. L. G. Filho and P. S. L. M. Barreto, “Demonstrating Data Possession and Uncheatable Data Transfer,” Cryptology ePrint Archive, Report 2006/150, 2006, http://eprint.iacr.org/.
[35] M. A. Shah, M. Baker, J. C. Mogul, and R. Swaminathan, “Auditing to Keep Online Storage Services Honest,” Proc. of the 11th USENIX Workshop on Hot Topics in Operating Systems (HotOS ’07), pp. 1–6, 2007.



[36] T. S. J. Schwarz and E. L. Miller, “Store, Forget, and Check: Using Algebraic Signatures to Check Remotely Administered Storage,” Proc. of ICDCS ’06, pp. 12–12, 2006.
