
Secure information storage in a private cloud built upon local network resources

Student Project for Large Installation Administration
Master of Science in System and Network Engineering
Universiteit van Amsterdam
Class of 2010-2011

Vic Ding ([email protected])
Damir Musulin ([email protected])

March 26, 2011
Version 1.0


Secure information storage in a private cloud built upon local network resources

2011-3-26 Version 1.0 2

Executive summary

This project investigates how to store data securely in a private cloud in which users have local access to each other's data, because every user workstation also serves as a storage node.

Two approaches to improving the security of the data store are presented: targeted distribution and on-the-fly cryptography. The first stores data that requires a certain permission only on designated groups of computers. The second encrypts the data to prevent unauthorized access. Both methods can also be applied to public clouds.

Swift [1], the open source private cloud storage implementation from OpenStack [2], is used to build our test environment. It was chosen because it is open and generic.

[1] http://www.openstack.org/projects/storage/ storage solution from OpenStack
[2] http://www.openstack.org/ open source implementation of private cloud from NASA and RackSpace


Table of Contents

Executive summary
Chapter 1 Introduction
Chapter 2 Research and approach
Chapter 3 Project environment
    Chapter 3.1 What is Swift?
Chapter 4 Targeted distribution
    Chapter 4.1 Setup for targeted distribution
    Chapter 4.2 Operation of targeted distribution
Chapter 5 Cryptography layer
    Chapter 5.1 Swift concept and place to inject cryptography
    Chapter 5.2 Revised server.py
Chapter 6 Conclusion
Chapter 7 Limitations & future study
Appendix A Cryptography methods in server.py for object storage node
References


Chapter 1 Introduction

As computers and network facilities have evolved, their power has increased remarkably. However, most of the computing power of workstations is wasted: the machines sit idle, or their CPU, memory and hard disk are only lightly used during working hours. To use this already-paid-for equipment efficiently, a private cloud can be built on the local network to offload the servers. This not only fits the trend towards green IT but can also reduce costs for the organization.

Within the private cloud environment, data is stored across the whole network. Sensitive information may therefore end up stored locally on the machine of a user who should never be able to access it, or a user may bypass the cloud software and read other users' data directly from the local file system.

To address this issue, we carried out this project with the following research question:

How can data be securely stored in a private cloud environment where the user has local access to it?

This research discusses two methods that can be used to mitigate the risk and secure the data stored in the private cloud.


Chapter 2 Research and approach

Based on the research question, two approaches are investigated:

1. Targeted distribution - Store privileged data only on the computers of privileged users.

2. Cryptography layer - Build an extra layer that encrypts data on write and decrypts it on read.

The first approach deals with how data is distributed in the cloud/network. If information is distributed across the cloud/network, there needs to be a way to control that distribution so that information is stored in preferred locations. For example, information could be stored in a network segment that is more secure than the general network.

The second approach adds cryptography to the private cloud software, which allows the uploader to encrypt information when uploading it to the cloud. The information can then be stored anywhere on the network, because it is encrypted and the password is needed to decrypt it. When the user needs to retrieve information from the cloud, the user specifies the information needed and supplies the password to decrypt it.

These two approaches make it possible to answer the research question in two different ways. The first is to secure data by storing it on a separate part of the network, thereby denying access to ordinary users. The second is to encrypt the data by adding an encryption layer, so that a local user cannot read the information.


Chapter 3 Project environment

The research focuses on security in the private cloud. It therefore requires private cloud software on which the research approaches can be implemented where possible.

Because there are multiple implementations of private cloud software, we restricted the choice to software with a liberal license, such as the GPL, BSD or Apache license.

Another criterion is an active community, so that support is available when problems with the private cloud software arise.

For the project environment OpenStack was chosen because of its liberal Apache license [3] and its strong community, with large companies and institutions such as NASA [5] (National Aeronautics and Space Administration) and Cisco [6] supporting the project.

Other private cloud software could have been used for the research, but the liberal license and the strong community convinced us to use OpenStack in our project environment.

[3] http://www.openstack.org/
[5] http://nebula.nasa.gov/blog/2010/jul/nebula-technology-to-play-key-role-in-new-open-sou/
[6] http://blogs.cisco.com/news/cisco-joins-openstack-community/


Chapter 3.1 What is Swift?

Swift is the object storage system for the OpenStack private cloud. It consists of individual components that together make up OpenStack Object Storage.

The components that make up OpenStack Object Storage are:

1. OpenStack Swift object storage nodes
2. OpenStack authentication system
3. OpenStack proxy system

Figure 1: OpenStack Object Storage [7]

[7] http://docs.openstack.org/openstack-object-storage/admin/content/ch03s02.html Figure 1 originates from the documentation website of Swift.


Chapter 4 Targeted distribution

The first approach to securing data in the cloud is targeted distribution.

With targeted distribution, the user can decide to which part of the network the user's data is distributed.

The idea behind targeted distribution is that a network consists of segments to which different security levels are applied. If a user has mission-critical data, the user can choose to distribute it only over a part of the network that is more secure than the general network.

For example, information from company management could be distributed only among the computers that meet a certain company security standard.

Targeted distribution lets the board of directors decide where data is stored while maintaining high availability.

Figure 2: Logical network layout for targeted distribution

In Figure 2 there are five logical network segments. If company management wants to store mission-critical data, the data may only be stored on computers whose security standard meets management's storage requirements. In this case management can trust its own network segment and the network segment with the elevated security standard. Management can target the computers in these segments to store its data; this is the idea of targeted distribution.
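The selection step described above can be illustrated with a minimal sketch. It is not part of the project code: the segment names, security levels, node names and both function names are hypothetical, and the inventory is a plain dictionary rather than a real network description.

```python
# Hypothetical inventory: segment name -> (security level, storage nodes).
# Higher numbers mean a stricter security standard.
SEGMENTS = {
    "management": (3, ["node01", "node02"]),
    "elevated": (2, ["node03", "node04", "node05"]),
    "general": (1, ["node06", "node07", "node08"]),
}


def eligible_nodes(required_level, segments=SEGMENTS):
    """Return, sorted, all nodes in segments at or above required_level."""
    nodes = []
    for level, members in segments.values():
        if level >= required_level:
            nodes.extend(members)
    return sorted(nodes)


def choose_replica_nodes(required_level, replicas=3, segments=SEGMENTS):
    """Pick the nodes that may hold the replicas of one object."""
    candidates = eligible_nodes(required_level, segments)
    if len(candidates) < replicas:
        # Too few trusted machines: the data cannot be fully replicated.
        raise ValueError("not enough eligible nodes for %d replicas" % replicas)
    return candidates[:replicas]
```

With this inventory, data that requires level 2 may be placed on five nodes, while data that requires level 3 has only two candidates, so three replicas cannot be stored. The stricter the requirement, the fewer machines remain available.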


Chapter 4.1 Setup for targeted distribution

To create a network layout like Figure 2 for targeted distribution, the Swift object storage system needs to be modified so that it is aware of the different network segments. One way to achieve this is to use LDAP: with the help of an LDAP server and modifications to Swift, the storage cloud can be made aware of the different segments of the network.

An issue with targeted distribution is that scalability can be problematic. Swift storage nodes are pre-defined during the initial setup. Adding storage nodes to a network segment at a later stage requires every node in that segment to be redefined so that it becomes aware of the new nodes.

In addition, when nodes are added to a network segment, the configuration in LDAP has to be changed manually to reflect the change.


Chapter 4.2 Operation of targeted distribution

Figure 3: LDAP operations

In Figure 3, node 12 requests information from the LDAP server. Once node 12 has received this information, it can ask all other nodes in the same segment to form a storage segment together with it.
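The lookup in Figure 3 can be sketched as follows. This is only an illustration under stated assumptions: a dictionary stands in for the LDAP server, and the node names, segment names and function names are hypothetical. A real deployment would query the directory over the network instead.

```python
# Hypothetical directory contents: node name -> network segment.
DIRECTORY = {
    "node11": "segment-a",
    "node12": "segment-b",
    "node13": "segment-b",
    "node14": "segment-b",
    "node15": "segment-c",
}


def lookup_segment(node, directory=DIRECTORY):
    """Step 1: the node asks the directory which segment it belongs to."""
    return directory[node]


def segment_peers(node, directory=DIRECTORY):
    """Step 2: the node collects every other node in the same segment."""
    segment = lookup_segment(node, directory)
    return sorted(
        name for name, seg in directory.items()
        if seg == segment and name != node
    )
```

In this sketch node12 finds node13 and node14 as its peers. A newly added node only becomes visible after its entry has been added to the directory, which corresponds to the manual LDAP change described in Chapter 4.1.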


Chapter 5 Cryptography layer

Cryptography is common practice in securing stored data. It ensures that data remains protected even if it becomes accessible to an unauthorized person. In our project, to keep the user environment as simple and independent as possible, we decided to implement the cryptography on the server side. More specifically, we revised the code of the storage node so that it encrypts and decrypts data during the upload and download of files.

When a user uploads a file, the file contains the original data without encryption. In the current version of Swift, a user with local access can read and modify it if he can locate the file on the local file system. We revised the code so that it encrypts the file at upload time with the Advanced Encryption Standard (AES) [8] using the Python Crypto library [9]. Figure 4 shows the same text file before and after encryption.

Figure 4

The original file is 2 bytes long and becomes 16 bytes after encryption. Our implementation pads the file before encrypting it, and 16 bytes is the minimum length of an encrypted file because AES operates on 16-byte blocks.
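The size change follows directly from the 16-byte block size. The small sketch below reproduces the arithmetic; the function name is hypothetical, and the rule (pad only when the length is not already a multiple of 16) is taken from the code in Appendix A.

```python
AES_BLOCK_SIZE = 16  # bytes; fixed by the AES standard


def padded_length(n, block=AES_BLOCK_SIZE):
    """Length of n bytes after padding up to the next multiple of block."""
    remainder = n % block
    if remainder == 0:
        return n  # already aligned: Appendix A adds no padding
    return n + (block - remainder)
```

For the 2-byte file in Figure 4, `padded_length(2)` gives 16. A 16-byte file stays at 16 bytes and a 17-byte file grows to 32 bytes.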

Figure 5 shows the encryption of a picture.

Figure 5

[8] http://en.wikipedia.org/wiki/Advanced_Encryption_Standard Advanced Encryption Standard
[9] https://launchpad.net/pycryptopp Python crypto library project website


This shows that users can upload any type of file, which is then secured by encryption with the user's key. When the user wants to download the file, it is automatically decrypted if the user provides the same, correct key. Hence, the data can be stored securely.

The procedure is elaborated in the following sections.


Chapter 5.1 Swift concept and place to inject cryptography

According to the concept of Swift, everything must be placed in a ring. "Everything" here means accounts, containers and objects; for objects there is an object ring. Since we only encrypt the objects, that is the actual files, we chose to revise the object server, server.py. It implements the storage nodes that participate in the ring operation shown in Figure 6 [10].

The idea behind this file is simple. It implements a Web Server Gateway Interface (WSGI) [11] application that handles all requests from users. When a user submits a request, it is converted into an HTTPS request, handled by the WSGI application and then passed on internally to the storage nodes where the actual file is stored.

The WSGI class defined in server.py for objects is the single point where both downloads and uploads are handled. It is therefore the ideal place to inject the cryptography code without altering the original Swift system too much.
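To illustrate why a single WSGI entry point is a convenient place for the hooks, here is a minimal, self-contained sketch. It is not Swift's object server: the in-memory `STORE`, the helper `call` and the byte-reversing `encrypt`/`decrypt` placeholders are assumptions made for the example, and the placeholders provide no real security.

```python
import io

STORE = {}  # hypothetical in-memory stand-in for the files on disk


def encrypt(data):
    """Placeholder transform (byte reversal), NOT real cryptography."""
    return data[::-1]


def decrypt(data):
    """Inverse of the placeholder transform above."""
    return data[::-1]


def application(environ, start_response):
    """Minimal WSGI app: every upload and download passes through here."""
    path = environ["PATH_INFO"]
    method = environ["REQUEST_METHOD"]
    if method == "PUT":
        length = int(environ.get("CONTENT_LENGTH") or 0)
        body = environ["wsgi.input"].read(length)
        STORE[path] = encrypt(body)  # upload hook: store only the transformed data
        start_response("201 Created", [("Content-Length", "0")])
        return [b""]
    if method == "GET" and path in STORE:
        body = decrypt(STORE[path])  # download hook: restore the original data
        start_response("200 OK", [("Content-Length", str(len(body)))])
        return [body]
    start_response("404 Not Found", [("Content-Length", "0")])
    return [b""]


def call(method, path, body=b""):
    """Drive the app directly, without starting a network server."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    environ = {
        "REQUEST_METHOD": method,
        "PATH_INFO": path,
        "CONTENT_LENGTH": str(len(body)),
        "wsgi.input": io.BytesIO(body),
    }
    chunks = application(environ, start_response)
    return captured["status"], b"".join(chunks)
```

Calling `call("PUT", "/a/c/o", b"hello")` followed by `call("GET", "/a/c/o")` returns the original bytes, while the value kept in `STORE` differs from what was uploaded. Replacing the two placeholder functions is the only change needed to switch the transform.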

Figure 6

[10] http://docs.openstack.org/openstack-object-storage/admin/content/ch03s02.html Origin of Figure 6, in the installation document of Swift on its official website
[11] http://wsgi.org/wsgi/What_is_WSGI Web site of WSGI (Web Server Gateway Interface), which is a Python standard, PEP 333 http://www.python.org/dev/peps/pep-0333/


However, there is no central point of editing. As Figure 6 illustrates, the storage nodes are deliberately separated from each other to maintain high scalability. This means that the WSGI application runs on every server, so the revised server file has to be propagated to each of them. To save manual labour, an automation tool such as CFengine [12] can be used.

[12] http://www.cfengine.org/ CFengine is a powerful data center configuration automation tool


Chapter 5.2 Revised server.py

Below are the signatures of the two functions we inserted into the WSGI implementation.

def encrypt_file(chunk, key, IV, blocksize=64*1024):

def decrypt_file(chunk, key, IV, blocksize=64*1024):

These two functions perform the encryption and decryption. The hash of the key of the operator's account is used as the encryption key. Besides improving security, the reason for hashing the key is that the Python library we use is quite low level: it is efficient but requires a fixed-length key of 16, 24 or 32 bytes. The output of the Secure Hash Algorithm 256 (SHA256) [13] is exactly 32 bytes long [14].
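The key preparation can be sketched with the standard library alone. The function name `derive_key` and the assumption that the account key is available as a text string are ours; the report does not show this step.

```python
import hashlib


def derive_key(account_key):
    """Hash an arbitrary-length account key (text) into a 32-byte AES key."""
    return hashlib.sha256(account_key.encode("utf-8")).digest()
```

Whatever the length of the input, `len(derive_key(...))` is 32, which is one of the key lengths the library accepts. A single unsalted hash is the scheme described in the text; a dedicated key derivation function such as `hashlib.pbkdf2_hmac` would be a possible hardening step, which we mention only as an option.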

The Initialization Vector (IV) can be a random 16-byte string. We can store the IV as the first 16 bytes of the encrypted file and let the server read it from there on each download request. The quality of the IV has a high impact on the quality of the cryptography, but the IV itself does not expose information to an attacker. That is why we can store it together with the encrypted file and generate it randomly each time.
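The storage layout described here, with the IV in front of the encrypted data, can be sketched as follows. The function names are hypothetical and the sketch covers only the framing, not the encryption itself. Note that the code in Appendix A does not use this layout; it works with a fixed IV.

```python
import os

IV_SIZE = 16  # bytes, equal to the AES block size


def new_iv():
    """Generate a fresh random IV for one file."""
    return os.urandom(IV_SIZE)


def pack(iv, ciphertext):
    """Store the IV as the first 16 bytes of the encrypted file."""
    if len(iv) != IV_SIZE:
        raise ValueError("IV must be exactly %d bytes" % IV_SIZE)
    return iv + ciphertext


def unpack(blob):
    """Split a stored file back into (iv, ciphertext) on download."""
    if len(blob) < IV_SIZE:
        raise ValueError("stored file is too short to contain an IV")
    return blob[:IV_SIZE], blob[IV_SIZE:]
```

On upload the server would call `new_iv()`, encrypt with it and write `pack(iv, ciphertext)`. On download `unpack` recovers the IV before decryption. The stored file is 16 bytes longer than the ciphertext.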

The block size is the size of the data chunk we operate on at a time. The network buffer is 65536 bytes (64 * 1024). We set the chunk size to the same value to avoid unnecessary splitting or merging of chunks, which saves computing power and improves efficiency.
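Chunked processing can be sketched as below. The generator name is hypothetical and an in-memory stream stands in for the network buffer. Because 65536 is a multiple of the 16-byte AES block size, only the last chunk of a file can need padding.

```python
import io

CHUNK_SIZE = 64 * 1024  # 65536 bytes, same as the network buffer


def iter_chunks(stream, size=CHUNK_SIZE):
    """Yield successive chunks of at most `size` bytes from a binary stream."""
    while True:
        chunk = stream.read(size)
        if not chunk:
            break
        yield chunk
```

For example, a stream of `2 * CHUNK_SIZE + 5` bytes wrapped in `io.BytesIO` yields two full chunks followed by one 5-byte chunk.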

File-level operations are handled by the Swift system. All we have to do is pass the encrypted chunk to the handler, or get the encrypted chunk from the handler.

The complete revised code can be found in Appendix A.

[13] http://en.wikipedia.org/wiki/SHA-2 Secure Hash Algorithm
[14] http://en.wikipedia.org/wiki/SHA-2 The section "Comparison of SHA functions" gives the output length of SHA256


Chapter 6 Conclusion

The research gives a positive answer to the research question: with either of the presented methods, data can be secured in a private cloud where users have local access rights. However, each method has advantages and disadvantages.

With targeted distribution, the number of computers that can be used to store data is limited because some of them fall into a lower security group. Hence the utilization of resources is again suboptimal. The security level, however, can be raised considerably: one can argue that an attacker who cannot even reach the system has no way to break it. Highly secured data is only accessible to those who should have the right to access it.

With the cryptography layer, the number of computers that can be used to store data is only limited by the total number of computers available in the organization, so the available resources are used optimally. However, the cryptography can impose a large performance penalty, especially when large amounts of data are encrypted.

Both approaches still have limitations and disadvantages. These may be addressed by the future studies recommended in the chapter on limitations and future study.

During the project we also investigated the consistency, availability and performance of the storage nodes before and after the modification.

According to the overview of Swift [15], consistency is sacrificed to achieve better availability and performance. The same behaviour can be observed after our modification of the software package.

• The consistency of the files is inherently low. In real use cases there are still different versions of the same file on different storage nodes after several days, even when the synchronization timer is set to one hour.

• The availability is generally high with the cryptography approach. With targeted distribution it depends strongly on the number of machines in the group concerned.

• The performance depends on the active machines that handle the request.

[15] http://programmerthoughts.com/openstack/swift-openstack-object-storage-overview/


Chapter 7 Limitations & future study

There are two limitations in our project:

1. There is no implementation of targeted distribution. Instead, only a proof of concept is illustrated in the report.

2. The storage nodes are accessible from other networks.

Implementing targeted distribution in Swift takes a long time. The distribution of replicas is controlled by the combination of the corresponding ring file and rsync, which sit at the bottom layer of the software package. Changing the distribution behaviour requires modifying both the ring file and rsync. The time needed for such a modification clearly falls outside the scope of our study. We therefore only produced a proof of concept as a showcase rather than a working system.
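To give an impression of what ring-controlled placement means, the following deliberately simplified sketch maps an object path to nodes by hashing. It is an illustration only and does not reproduce Swift's ring file format, its partitioning or its replication; all names are hypothetical.

```python
import hashlib


def ring_position(object_path, ring_size):
    """Map an object path to a stable position in [0, ring_size)."""
    digest = hashlib.sha256(object_path.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % ring_size


def place_replicas(object_path, nodes, replicas=3):
    """Pick `replicas` distinct nodes, starting at the object's ring position."""
    if replicas > len(nodes):
        raise ValueError("more replicas requested than nodes available")
    start = ring_position(object_path, len(nodes))
    return [nodes[(start + i) % len(nodes)] for i in range(replicas)]
```

In such a scheme, targeted distribution would amount to building the `nodes` list from the machines of the trusted segment only, so that the same path always maps to nodes inside that segment.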

The Swift manual [16] states that the storage nodes contain the data and should be placed in a separate network that is only accessible from the local network. In our setup this is not the case. We ran seven virtual machines on two lab servers: one proxy server, one authentication server and five storage nodes. We had to simulate the situation in which the storage nodes are accessible to local users. To achieve that effect, we used the storage nodes themselves as the access terminals from which to reach the local data of the other nodes. To be able to log in to these "terminals", we had to enable access from outside.

These are the two limitations we faced and dealt with during the study. For the first limitation we recommend a future study on the implementation of targeted distribution. It would be interesting to see in practice how the distribution behaviour can be controlled and how the replicas can be arranged in a way that complies with organizational security policies.

In addition, a very interesting and important further step would extend this study and make it more useful: investigating the feasibility and effect of moving the authentication server, where the keys are located, to the customer side. Many organizations are currently concerned that, when they use external storage, their keys are out of their control because the authentication or key server is located on the supplier side. If further study can show that the keys can be kept within the organization and only communicated to the service provider in a secure way, it will certainly raise the confidence of customers, who can then make better use of the technology.

[16] http://docs.openstack.org/openstack-object-storage/admin/content/ch03s02.html


Appendix A Cryptography methods in server.py for object storage node

from Crypto.Cipher import AES  # assumed import, implied by the AES.new() calls below

def encrypt_file(chunk, key, iv, blocksize=64*1024):
    # The iv argument is overwritten: every chunk uses the same all-zero IV.
    iv = b'\x00' * 16
    encryptor = AES.new(key, AES.MODE_CBC, iv)
    # Pad with spaces up to the next multiple of the 16-byte AES block size.
    if len(chunk) % 16 != 0:
        chunk += b' ' * (16 - len(chunk) % 16)
    return encryptor.encrypt(chunk)

def decrypt_file(chunk, key, iv, blocksize=64*1024):
    # Uses the same all-zero IV as encrypt_file; the space padding is not stripped.
    iv = b'\x00' * 16
    decryptor = AES.new(key, AES.MODE_CBC, iv)
    return decryptor.decrypt(chunk)


References

1. http://www.openstack.org/projects/storage/ storage solution from OpenStack

2. http://www.openstack.org/ open source implementation of private cloud from NASA and RackSpace

3. http://www.openstack.org/

4. http://nebula.nasa.gov/blog/2010/jul/nebula-technology-to-play-key-role-in-new-open-sou/

5. http://blogs.cisco.com/news/cisco-joins-openstack-community/

6. http://docs.openstack.org/openstack-object-storage/admin/content/ch03s02.html Figure 1 originates from the documentation website of Swift.

7. http://en.wikipedia.org/wiki/Advanced_Encryption_Standard Advanced Encryption Standard

8. https://launchpad.net/pycryptopp Python crypto library project website

9. http://docs.openstack.org/openstack-object-storage/admin/content/ch03s02.html Origin of Figure 6, in the installation document of Swift on its official website

10. http://wsgi.org/wsgi/What_is_WSGI Web site of WSGI (Web Server Gateway Interface), which is a Python standard, PEP 333 http://www.python.org/dev/peps/pep-0333/

11. http://www.cfengine.org/ CFengine is a powerful data center configuration automation tool

12. http://en.wikipedia.org/wiki/SHA-2 Secure Hash Algorithm

13. http://en.wikipedia.org/wiki/SHA-2 The section "Comparison of SHA functions" gives the output length of SHA256

14. http://programmerthoughts.com/openstack/swift-openstack-object-storage-overview/

15. http://docs.openstack.org/openstack-object-storage/admin/content/ch03s02.html