Advanced Science and Technology Letters Vol.50 (CST 2014), pp.67-73
http://dx.doi.org/10.14257/astl.2014.50.11
ISSN: 2287-1233 ASTL Copyright © 2014 SERSC

Batch Auditing for Multiclient Data in Multicloud Storage

Zhihua Xia, Xinhui Wang, Xingming Sun, Yafeng Zhu, Peng Ji and Jin Wang

Jiangsu Engineering Center of Network Monitoring, Nanjing University of Information Science & Technology, Nanjing, 210044, China

School of Computer & Software, Nanjing University of Information Science & Technology, Nanjing, 210044, China

Abstract. Cloud storage enables users to outsource their data to cloud servers and enjoy on-demand, high-quality services. However, this new paradigm also introduces many challenges due to the security and integrity threats toward users' outsourced data. Recently, various remote integrity auditing methods have been proposed, but most of them serve only a single-cloud environment or audit each data file individually. In this paper, we develop an efficient auditing mechanism which supports batch auditing for multiple data files in a multi-cloud environment. By utilizing the bilinear map, the proposed protocol achieves fully stateless and transparent verification. By constructing a sequence-enforced Merkle Hash Tree, the proposed protocol can resist the replace attack. In addition, our protocol protects the position information of the data blocks by generating fake data blocks to confuse the organizer. By computing intermediate values of the verification on cloud servers, our method can greatly reduce the computing overhead of the auditor. The performance analysis demonstrates the good efficiency of the proposed protocol.

1 Introduction

Cloud storage is an important service of cloud computing, which has become a new source of profit growth by relieving individuals and enterprises of the burden of storage management and maintenance. By storing data remotely in the cloud, users can access their data over networks at any time and from anywhere. However, many users are still hesitant to use this novel paradigm due to the security and integrity threats toward their outsourced data. This is because data loss can occur in any infrastructure, no matter how reliable the measures taken by the cloud service providers (CSPs) [1]. Moreover, the CSPs could be dishonest. They may hide data loss accidents to maintain their reputation, or even discard data that has never or rarely been accessed to save storage space while claiming that the data is still correctly stored in the cloud. Therefore, it is essential for cloud users to check the integrity and availability of their cloud data.

In order to address the issue above, various Provable Data Possession (PDP) protocols have been proposed [2-4]. PDP is a probabilistic proof technique for checking the availability and integrity of outsourced data by randomly sampling a few file blocks. Ateniese et al. [2] first defined the PDP model with public verification. They utilized RSA-based homomorphic linear authenticators (HLA) and suggested randomly sampling a few blocks of the file for verification. In order to support dynamic operations, Ateniese et al. [3] developed a partially dynamic version of the scalable PDP model based on symmetric key cryptography. After that, Erway et al. [4] presented a skip list-based dynamic PDP model with fully dynamic data operations. Wang et al. also proposed a dynamic PDP scheme that combines Boneh–Lynn–Shacham (BLS) signature-based HLA with a Merkle Hash Tree (MHT) structure [5, 6]. The scheme supports both public stateless verification and fully dynamic data updates. Their subsequent work [7] proposed a scheme supporting privacy-preserving public auditing, which was also extended to enable batch auditing. In other related work, Juels and Kaliski [8] describe a proofs of retrievability (POR) model, which not only verifies data possession but also ensures retrievability of the raw data files when an abnormality is detected.
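The probabilistic nature of sampling-based PDP can be made concrete with a short calculation. The sketch below is not taken from the paper: it assumes the auditor samples c distinct blocks uniformly at random from n blocks, of which t are corrupted, and the function name `detection_probability` is hypothetical.

```python
from math import comb


def detection_probability(n: int, t: int, c: int) -> float:
    """Chance that a uniform sample of c distinct blocks (out of n)
    contains at least one of the t corrupted blocks."""
    if not (0 <= t <= n and 0 <= c <= n):
        raise ValueError("need 0 <= t <= n and 0 <= c <= n")
    # P(miss every corrupted block) = C(n - t, c) / C(n, c)
    return 1.0 - comb(n - t, c) / comb(n, c)


# Example: 10,000 blocks, 1% corrupted, 460 blocks sampled per audit.
p_detect = detection_probability(10_000, 100, 460)
```

Under these assumptions, sampling a few hundred blocks already detects a 1% corruption with probability above 0.99, which is why PDP-style audits do not need to read the whole file.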

Most of the above PDP schemes mainly address integrity verification at a single CSP. In a more realistic application scenario, users may store their data across multiple clouds in a distributed manner to reduce the threats to data integrity and availability [9]. In this scenario, the multicloud is composed of multiple private or public clouds. Each CSP has a different level of quality of service as well as a different cost associated with it. Hence, users can store their data files on more than one CSP according to the required level of security and their affordable budgets. Within the multicloud, an organization can offer and manage in-house and out-house resources [10, 11].

In this paper, we develop an efficient auditing mechanism which supports batch auditing for multiple data files in a multi-cloud environment. By utilizing the bilinear map, the proposed protocol can aggregate the verification tasks from different users to reduce the computing overhead of the auditor. By constructing a sequence-enforced Merkle Hash Tree, the proposed protocol can resist the replace attack. In addition, our protocol protects the position information of the data blocks by generating fake data blocks to confuse the organizer, so as to achieve fully stateless and transparent verification.

2 Problem Statement

We consider a multicloud storage service model which is adopted by some previous works [10, 11]. The model involves three different entities: cloud users, the multicloud and a third party auditor.

The cloud users have a number of data files to be stored in multiple clouds. They have the authority to access and manipulate the stored data.

The multicloud consists of multiple Cloud Service Providers (CSPs). They provide data storage service and have enough storage space and significant computation resources. In this paper, to reduce the communication burden of the verifier, one of the CSPs is designated as an organizer for auditing purposes. For example, the Zoho cloud in Fig. 1 is considered as an organizer. In our scheme, the organizer takes the responsibility to distribute the auditing challenge and aggregate the proofs from multiple clouds. We suppose that, for auditing, the CSPs cannot communicate with each other except through the organizer, and that the verifier can only contact the organizer.

The Third Party Auditor (TPA) has more powerful computation and communication abilities than regular cloud users. In a cloud storage system, neither the cloud service providers nor the users can be guaranteed to provide an unbiased auditing result. Thus, third party auditing is a natural choice. Moreover, by resorting to the TPA, users are relieved of the burden of checking the integrity of their outsourced data.

3 The Batch Auditing CPDP Protocol

In the proposed protocol, we need to use a bilinear map group system, two hash functions and a signature function. Let $S = (p, g, G_1, G_2, G_T, e)$ be a bilinear map group system with generator $g$, where $G_1$, $G_2$ and $G_T$ are multiplicative cyclic groups of prime order $p$, and $e: G_1 \times G_2 \to G_T$ is a bilinear map. Let $H(\cdot): \{0,1\}^* \to G_1$ be a secure map-to-point hash function, $h(\cdot): G_T \to Z_p$ be another hash function which maps elements of $G_T$ to $Z_p$, and $Sig(\cdot)$ be the signature function. Let $\{U_k\}$ be the set of users, and $\{P_c\}$ be the set of CSPs. The number of CSPs is denoted as $C$.
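For reference, the two standard properties usually required of such a pairing are restated below. They are textbook definitions rather than statements from this paper, and the third line is a direct consequence of bilinearity that the verification equations rely on.

```latex
% Standard properties of a bilinear map e : G_1 \times G_2 \to G_T
% (textbook definitions, restated here as background).
\begin{align*}
\text{Bilinearity:}\quad
  & e(u^{a}, v^{b}) = e(u, v)^{ab}
  && \forall\, u \in G_1,\ v \in G_2,\ a, b \in Z_p, \\
\text{Non-degeneracy:}\quad
  & e(g_1, g_2) \neq 1_{G_T}
  && \text{for generators } g_1 \in G_1,\ g_2 \in G_2, \\
\text{Consequence:}\quad
  & e(u_1 u_2, v) = e(u_1, v)\, e(u_2, v)
  && \forall\, u_1, u_2 \in G_1,\ v \in G_2.
\end{align*}
```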

3.1 Setup Phase

Each user runs the setup phase of the protocol as follows:

Step 1: $KeyGen(1^{\lambda})$. The $k$-th user $U_k$ generates a signing key pair $(ssk_k, spk_k)$. Select a random $\alpha_k \in Z_p$, and compute $\beta_k = g^{\alpha_k}$. Select $s$ random elements $u_{k,1}, u_{k,2}, \ldots, u_{k,s} \in G_1$. To sum up, the secret key is $sk_k = (\alpha_k, ssk_k)$ and the public parameters are $pk_k = (spk_k, \beta_k, g, \{u_{k,j}\}_{1 \le j \le s})$.

Step 2: $TagGen(sk_k, F_k, P)$. The user $U_k$ splits his file $F_k$ into $n \times s$ sectors, $F_k = \{m_{k,i,j}\}_{i \in [1,n],\, j \in [1,s]} \in Z_p^{n \times s}$, where the $i$-th block $m_{k,i}$ of user $U_k$ consists of $s$ sectors $\{m_{k,i,1}, m_{k,i,2}, \ldots, m_{k,i,s}\}$. The user computes $t_k = name_k \,\|\, n \,\|\, Sig_{ssk_k}(name_k \,\|\, n)$ as the file tag for $F_k$, where $name_k$ is the file name. Next, the user $U_k$ constructs an SMHT with root $R_k$, where the leaf nodes are the ordered hash values of the data blocks, $\{T_{k,i} = H(m_{k,i})\}_{i \in [1,n]}$, as described in part 2.3. The user also signs $R_k$ with the private key $\alpha_k$:

$$Sig_{\alpha_k}(H(R_k)) = (H(R_k))^{\alpha_k}. \tag{1}$$


For each data block $m_{k,i}$ $(i \in [1,n])$, the user $U_k$ computes a data tag $\sigma_{k,i}$ as

$$\sigma_{k,i} = \Big(T_{k,i} \cdot \prod_{j=1}^{s} u_{k,j}^{m_{k,i,j}}\Big)^{\alpha_k}. \tag{2}$$

Besides, the user $U_k$ generates an additional random data block denoted as $m_{k,n'}$, which is also divided into $s$ sectors $\{m_{k,n',j}\}_{j \in [1,s]}$. Then the corresponding tag is computed as

$$\sigma_{k,n'} = \Big(H(R_k)^{1/C} \cdot \prod_{j=1}^{s} u_{k,j}^{m_{k,n',j}}\Big)^{\alpha_k}. \tag{3}$$

After all the parameters have been prepared, the user $U_k$ distributes the data block and tag pairs $(m_{k,i}, \sigma_{k,i})$ to the corresponding cloud service providers, and sends the random data block and its corresponding tag $(m_{k,n'}, \sigma_{k,n'})$ to each cloud service provider. In addition, the user $U_k$ sends the parameters $\{SMHT, Sig_{\alpha_k}(H(R_k)), t_k\}$ to the organizer, and sends the public parameters $pk_k$ to the TPA. After data transmission, the user asks the TPA to conduct a confirmation audit to make sure that the data is correctly stored on all the servers. Once confirmed, the user can choose to delete the local copy of the data blocks, keeping only the secret key. From then on, the TPA can run the sampling audit periodically to check data integrity on behalf of the users.
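The tree built in the setup phase can be illustrated with an ordinary Merkle Hash Tree. The sketch below is a simplified stand-in, not the paper's sequence-enforced construction, whose internals are not given in this text: it assumes SHA-256 in place of the map-to-point hash, duplicates the last node on odd levels, and binds a leaf to its position by deriving the left/right order from the claimed index. All function names are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    """SHA-256, standing in for the paper's hash function (assumption)."""
    return hashlib.sha256(data).digest()


def build_levels(leaves):
    """Return every tree level, leaves first; an odd node is paired with itself."""
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2:
            cur = cur + [cur[-1]]
        levels.append([h(cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels


def auth_path(levels, index):
    """Sibling hashes on the path from leaf `index` (0-based) up to the root."""
    path = []
    for level in levels[:-1]:
        if len(level) % 2:
            level = level + [level[-1]]
        path.append(level[index ^ 1])
        index //= 2
    return path


def root_from_path(leaf, index, path):
    """Recompute the root; the claimed index fixes the left/right order."""
    node = leaf
    for sibling in path:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node


blocks = [b"block-%d" % i for i in range(5)]
leaves = [h(b) for b in blocks]
levels = build_levels(leaves)
root = levels[-1][0]
```

A verifier holding only `root` accepts a leaf when `root_from_path(leaf, i, path) == root`. Presenting the same leaf and path under a different index changes the hashing order, so the recomputed root no longer matches; this is the kind of position check that defeats a replace attack.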

3.2 Batch Audit Phase

Suppose that the TPA processes $K$ auditing sessions on $K$ distinct data files simultaneously. The data files are stored on $C$ CSPs $\{P_c\}_{c \in [1,C]}$. The audit phase is executed as follows:

Step 1: $GenProof(\{m_i, \sigma_i\}, chal)$. It is an interactive 5-move protocol among the CSPs, an organizer (O), and an auditor (TPA). This process is described in the following.

1) Retrieve (O→TPA): After receiving the verification request, the organizer sends the file tags $\{t_k\}_{k \in [1,K]}$ to the TPA. The TPA recovers $\{name_k\}_{k \in [1,K]}$ and $\{n_k\}_{k \in [1,K]}$ by using the public keys $\{spk_k\}_{k \in [1,K]}$, and verifies all the signatures. The verification quits by emitting FALSE if the file name verification fails.

2) Challenge1 (O←TPA): If the file names are successfully verified, the TPA generates the challenge index-coefficient message $Q_k = \{(i, v_i)\}_{i \in I_k}$ for $k \in [1,K]$, where $I_k \subseteq [1, n_k]$ specifies the sampled blocks that will be verified, and $v_i \in Z_p$ is a random element. The TPA chooses a random element $\gamma \in Z_p$ and computes the set of challenge stamps $\{H_k = \beta_k^{\gamma}\}_{k \in [1,K]}$. To sum up, the TPA generates the challenge as follows, and sends it to the organizer.

$$chal = (\{Q_k\}_{k \in [1,K]}, \{H_k\}_{k \in [1,K]}). \tag{4}$$

3) Challenge2 (P←O): Upon receiving the challenge $chal$ from the TPA, the organizer forwards $chal$ to each $P_c$.

4) Response1 (P→O): For each $Q_k$, each $P_c$ picks out $Q_{c,k} = \{(i, v_i)\}_{m_{k,i} \in P_c} \subseteq Q_k$. Here, the symbol $m_{k,i} \in P_c$ means that the data block $m_{k,i}$ is stored on $P_c$. Then, $P_c$ calculates the linear combination of the specified data blocks for each file as

$$\mu_{c,k,j} = m_{k,n',j} + \sum_{(i, v_i) \in Q_{c,k}} v_i \cdot m_{k,i,j} \in Z_p, \tag{5}$$

and generates the data proof $DP_c$ and the tag proof $TP_c$ as

$$DP_c = \prod_{j=1}^{s} \prod_{k=1}^{K} e\big(u_{k,j}^{\mu_{c,k,j}}, H_k\big), \qquad TP_c = \prod_{k=1}^{K} \Big(\sigma_{k,n'} \cdot \prod_{(i, v_i) \in Q_{c,k}} \sigma_{k,i}^{v_i}\Big) \in G_1. \tag{6}$$

Finally, $P_c$ returns the proof $(DP_c, TP_c)$ to the organizer.

5) Response2 (O→TPA): Upon receiving the proofs from all the CSPs, the organizer aggregates the proofs $(DP_c, TP_c)$ into a response $(DP, TP)$, which is calculated as

$$DP = \prod_{c=1}^{C} DP_c, \qquad TP = \prod_{c=1}^{C} TP_c. \tag{7}$$

In addition, the organizer also provides the TPA with the auxiliary authentication information (AAI) $\{\Omega_{k,i}\}_{i \in I_k,\, k \in [1,K]}$, which are the siblings of the nodes on the path from the leaves $\{T_{k,i}\}_{i \in I_k,\, k \in [1,K]}$ to the root $R_k$ of the SMHT. Finally, the organizer responds to the TPA with the proof

$$Pr = \big\{DP,\ TP,\ \{T_{k,i}, \Omega_{k,i}\}_{i \in I_k,\, k \in [1,K]},\ \{Sig_{\alpha_k}(H(R_k))\}_{k \in [1,K]}\big\}. \tag{8}$$
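The Challenge1 and Response1 moves reduce to simple arithmetic in $Z_p$ on the CSP side. The sketch below illustrates only the index-coefficient sampling and the per-CSP linear combination (the fake block's sectors plus the weighted sum of the challenged blocks a CSP actually stores); the pairing-based proofs are omitted. The modulus `P`, the function names and the data layout are assumptions made for illustration.

```python
import secrets

P = 2**127 - 1  # a prime standing in for the group order p (assumption)


def gen_challenge(n_blocks: int, sample_size: int):
    """Build Q_k = [(i, v_i)]: distinct 1-based block indices, each with a
    random non-zero coefficient in Z_p."""
    rng = secrets.SystemRandom()
    indices = sorted(rng.sample(range(1, n_blocks + 1), sample_size))
    return [(i, rng.randrange(1, P)) for i in indices]


def csp_response(challenge, stored, fake_block):
    """Per-sector combination computed by one CSP.

    `stored` maps block index -> list of s sectors held by this CSP;
    `fake_block` is the extra random block every CSP receives at setup.
    Challenged blocks held elsewhere are simply skipped.
    """
    mu = [m % P for m in fake_block]
    for i, v in challenge:
        if i in stored:
            for j, sector in enumerate(stored[i]):
                mu[j] = (mu[j] + v * sector) % P
    return mu


# Tiny worked example: this CSP holds blocks 1 and 2; block 3 lives elsewhere.
mu = csp_response([(1, 2), (3, 5)], {1: [1, 2], 2: [9, 9]}, [10, 20])
```

Because the fake block is always added, a CSP that holds none of the challenged blocks still returns a well-formed response, so the organizer cannot tell from the responses which CSP stores which block.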

Step 2: $BatchVerifyProof(pk, chal, Pr)$. After receiving the proof $Pr$ from the organizer, the TPA starts to verify the proof. First, for each $k \in [1,K]$ and $i \in I_k$, the TPA verifies the position of $\{T_{k,i}\}_{i \in I_k,\, k \in [1,K]}$ by checking whether the equation $LEFT(T_{k,i}) + 1 = i$ holds. Second, the TPA constructs a verification root $R'_k$ by using $\{T_{k,i}, \Omega_{k,i}\}_{i \in I_k}$ for $k \in [1,K]$, and verifies the values of $R'_k$ by checking Eq. (9). Third, if both authentications above succeed, the TPA verifies data integrity by checking Eq. (10).

$$\prod_{k=1}^{K} e\big(Sig_{\alpha_k}(H(R_k)), g\big) \stackrel{?}{=} \prod_{k=1}^{K} e\big(H(R'_k), \beta_k\big) \tag{9}$$

$$e(TP, g^{\gamma}) \stackrel{?}{=} DP \cdot \prod_{k=1}^{K} e\Big(H(R_k) \cdot \prod_{i \in I_k} T_{k,i}^{v_i},\ H_k\Big) \tag{10}$$

If the data blocks are not damaged by the cloud servers, Eq. (10) holds.
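The algebra behind the final integrity check can be exercised with a toy model. The sketch below is an insecure illustration, not an implementation of the protocol: every group element is stored as its discrete logarithm modulo a prime, so group multiplication becomes addition, exponentiation becomes multiplication, and the pairing becomes a product of exponents. The names `alpha` (user secret exponent), `gamma` (the TPA's random challenge element) and `stamps` (challenge stamps) are assumed symbols, and hashes are replaced by SHA-256 reduced modulo the prime.

```python
import hashlib
import random

P = 2**127 - 1          # prime group order (toy choice, an assumption)
rng = random.Random(7)  # fixed seed so the sketch is reproducible


def hash_to_zp(*parts) -> int:
    digest = hashlib.sha256(repr(parts).encode()).digest()
    return int.from_bytes(digest, "big") % P


def pair(a: int, b: int) -> int:
    """Toy pairing on exponents: e(g^a, g^b) = gT^(a*b)."""
    return a * b % P


def run_audit(K=2, C=3, n=8, s=4, sample=3, corrupt=False) -> bool:
    inv_C = pow(C, -1, P)                     # the exponent 1/C in Z_p
    users, chal = [], []
    for k in range(K):
        alpha = rng.randrange(1, P)
        u = [rng.randrange(1, P) for _ in range(s)]
        blocks = {i: [rng.randrange(P) for _ in range(s)] for i in range(1, n + 1)}
        fake = [rng.randrange(P) for _ in range(s)]
        T = {i: hash_to_zp(k, i, blocks[i]) for i in blocks}
        HR = hash_to_zp("root", k, sorted(T.items()))   # stand-in for H(R_k)
        dot = lambda blk: sum(uj * m for uj, m in zip(u, blk))
        tag = {i: alpha * (T[i] + dot(blocks[i])) % P for i in blocks}
        fake_tag = alpha * (HR * inv_C + dot(fake)) % P
        place = {i: rng.randrange(C) for i in blocks}   # block -> CSP
        users.append(dict(alpha=alpha, u=u, blocks=blocks, fake=fake, T=T,
                          HR=HR, tag=tag, fake_tag=fake_tag, place=place))
        idx = sorted(rng.sample(range(1, n + 1), sample))
        chal.append([(i, rng.randrange(1, P)) for i in idx])
    gamma = rng.randrange(1, P)
    stamps = [usr["alpha"] * gamma % P for usr in users]

    if corrupt:  # damage one challenged sector after the tags were made
        i0 = chal[0][0][0]
        users[0]["blocks"][i0][0] = (users[0]["blocks"][i0][0] + 1) % P

    DP = TP = 0  # organizer's aggregation over all CSPs
    for c in range(C):
        for k, usr in enumerate(users):
            mine = [(i, v) for i, v in chal[k] if usr["place"][i] == c]
            mu = [(usr["fake"][j] + sum(v * usr["blocks"][i][j] for i, v in mine)) % P
                  for j in range(s)]
            DP = (DP + sum(pair(usr["u"][j] * mu[j] % P, stamps[k]) for j in range(s))) % P
            TP = (TP + usr["fake_tag"] + sum(v * usr["tag"][i] for i, v in mine)) % P

    rhs = DP     # TPA side: e(TP, g^gamma) against DP times the hash terms
    for k, usr in enumerate(users):
        base = (usr["HR"] + sum(v * usr["T"][i] for i, v in chal[k])) % P
        rhs = (rhs + pair(base, stamps[k])) % P
    return pair(TP, gamma) == rhs
```

With honest CSPs, `run_audit()` returns `True`: each of the C CSPs contributes the fake tag once, so the 1/C exponents multiply back to a single H(R_k) per user, and the sector terms in the tag proof cancel against the data proof. With `corrupt=True` a challenged sector is altered after tagging and the check returns `False`. Exposing discrete logarithms makes this model useless for security; it only confirms that the equations are mutually consistent.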

4 Security Analysis

Privacy-preserving property: Before computing and returning Response1, the CSPs authenticate the challenge request with the certificate issued by the user on the TPA's public key. Thus, only the TPA can send the authentication request. Moreover, after receiving the final response from the organizer, the TPA first needs to validate the leaves of the SMHT, which is stored on the organizer. This guarantees that only the organizer can compute the final response. Besides, to protect data privacy, we use Eq. (5) to process the linear combination of the specified data blocks, which works in the same way as the random mask technique.

Transparent verification property: This paper introduces an organizer, which is responsible for interacting with the TPA. Thus, the TPA cannot learn the details of data storage. In this paper, the organizer is also considered untrusted. In order to prevent the organizer from learning the location of the data blocks, the user generates an additional data-tag pair which is sent to each CSP in the setup phase of the protocol. In the audit phase, whichever data blocks are in the challenge request from the TPA, every CSP sends a response to the organizer, which conceals the details of data storage from the organizer. Thus, the proposed batch auditing protocol conceals the storage details of multiple CSPs from both the TPA and the organizer.

5 Conclusions

In this paper, we explore the collaborative integrity auditing of remote data stored in multiple clouds. A batch auditing PDP mechanism for the multi-cloud environment is proposed. Our construction enables the TPA to perform multiple auditing tasks for multiple data files in multiple clouds. Meanwhile, transparent verification, fully stateless verification and security are also important objectives of the protocol. Utilizing the SMHT construction, the proposed protocol achieves fully stateless verification and dynamic data operation with integrity assurance. The paper uses a BLS-based homomorphic authenticator to equip the verification protocol, which enables the TPA to perform batch audits for multiple users based on the technique of bilinear aggregate signatures. In addition, by letting the cloud servers compute intermediate values of the verification for the auditor, our method can greatly reduce the computing overhead of the TPA. Finally, the security analysis shows that the proposed batch auditing protocol is secure, and our performance evaluation indicates much better efficiency than individual auditing.

Acknowledgements. This work is supported by the NSFC (61232016, 61173141, 61173142, 61173136, 61103215, 61373132, 61373133), GYHY201206033, 201301030, 2013DFG12860, BC2013012 and PAPD fund.

References

1. Goodson, G. R. et al.: Efficient Byzantine-tolerant erasure-coded storage. Dependable Systems and Networks, 2004 International Conference on, 2004, pp. 135-144.

2. Ateniese, G. et al.: Provable data possession at untrusted stores. Proceedings of the 14th ACM conference on Computer and communications security, 2007, pp. 598-609.

3. Ateniese, G. et al.: Scalable and efficient provable data possession. Proceedings of the 4th international conference on Security and privacy in communication networks, 2008, p. 9.

4. Erway, C. et al.: Dynamic provable data possession. Proceedings of the 16th ACM conference on Computer and communications security, 2009, pp. 213-222.

5. Wang, Q. et al.: Enabling public auditability and data dynamics for storage security in cloud computing. Parallel and Distributed Systems, IEEE Transactions on, vol. 22, pp. 847-859, 2011.

6. Boneh, D. et al.: Short signatures from the Weil pairing. Advances in Cryptology – ASIACRYPT 2001, ed: Springer, 2001, pp. 514-532.

7. Wang, C. et al.: Privacy-preserving public auditing for secure cloud storage. 2013.

8. Juels, A., Kaliski Jr., B. S.: PORs: Proofs of retrievability for large files. Proceedings of the 14th ACM conference on Computer and communications security, 2007, pp. 584-597.

9. AlZain, M. A. et al.: Cloud computing security: from single to multi-clouds. System Science (HICSS), 2012 45th Hawaii International Conference on, 2012, pp. 5490-5499.

10. Zhu, Y. et al.: Collaborative integrity verification in hybrid clouds. Collaborative Computing: Networking, Applications and Worksharing (CollaborateCom), 2011 7th International Conference on, 2011, pp. 191-200.

11. Zhu, Y. et al.: Cooperative provable data possession for integrity verification in multi-cloud storage. IEEE Transactions on Parallel and Distributed Systems, vol. 23(12), pp. 2231-2244, 2012.
