Scalable Object Storage with Apache CloudStack and Apache Hadoop
DESCRIPTION
Object storage (like AWS S3) is a key enabler of scalability and reliability in cloud computing. This talk discusses how Apache CloudStack integrates object storage solutions and, specifically, how HDFS can provide the storage engine for the object storage component.
TRANSCRIPT
![Page 1: Scalable Object Storage with Apache CloudStack and Apache Hadoop](https://reader033.vdocuments.site/reader033/viewer/2022050904/54535d2fb1af9f88228b45a8/html5/thumbnails/1.jpg)
Scalable Object Storage with Apache CloudStack and Apache Hadoop
April 30, 2013
Chiradeep Vittal (@chiradeep)
![Page 2: Scalable Object Storage with Apache CloudStack and Apache Hadoop](https://reader033.vdocuments.site/reader033/viewer/2022050904/54535d2fb1af9f88228b45a8/html5/thumbnails/2.jpg)
Agenda
• What is CloudStack
• Object Storage for IaaS
• Current Architecture and Limitations
• Requirements for Object Storage
• Object Storage integrations in CloudStack
• HDFS for Object Storage
• Future directions
![Page 3: Scalable Object Storage with Apache CloudStack and Apache Hadoop](https://reader033.vdocuments.site/reader033/viewer/2022050904/54535d2fb1af9f88228b45a8/html5/thumbnails/3.jpg)
Apache CloudStack
Build your cloud the way the world’s most successful clouds are built
History
• Incubating in the Apache Software Foundation since April 2012
• Open source since May 2010
• In production since 2009
– Turnkey platform for delivering IaaS clouds
– Full-featured GUI, end-user API and admin API
![Page 4: Scalable Object Storage with Apache CloudStack and Apache Hadoop](https://reader033.vdocuments.site/reader033/viewer/2022050904/54535d2fb1af9f88228b45a8/html5/thumbnails/4.jpg)
How did Amazon build its cloud?
Commodity Servers
Commodity Storage
Networking
Open Source Xen Hypervisor
Amazon Orchestration Software
AWS API (EC2, S3, …)
Amazon eCommerce Platform
![Page 5: Scalable Object Storage with Apache CloudStack and Apache Hadoop](https://reader033.vdocuments.site/reader033/viewer/2022050904/54535d2fb1af9f88228b45a8/html5/thumbnails/5.jpg)
How can YOU build a cloud?
Servers
Storage
Networking
Open Source Xen Hypervisor
Amazon Orchestration Software
AWS API (EC2, S3, …)
Amazon eCommerce Platform
Hypervisor (Xen/KVM/VMW/)
CloudStack Orchestration Software
Optional Portal
CloudStack or AWS API
![Page 6: Scalable Object Storage with Apache CloudStack and Apache Hadoop](https://reader033.vdocuments.site/reader033/viewer/2022050904/54535d2fb1af9f88228b45a8/html5/thumbnails/6.jpg)
Zone Architecture
[Diagram: end users reach the zone through the DC edge and the L3/L2 core. The zone holds several pods and secondary storage (images, snapshots). Each pod has an access switch, hypervisors (Xen/VMware/KVM) running VMs, and primary storage (NFS/iSCSI/FC) holding disks. CloudStack, backed by MySQL, exposes the admin/user API.]
![Page 7: Scalable Object Storage with Apache CloudStack and Apache Hadoop](https://reader033.vdocuments.site/reader033/viewer/2022050904/54535d2fb1af9f88228b45a8/html5/thumbnails/7.jpg)
Cloud-Style Workloads
• Low cost
– Standardized, cookie-cutter infrastructure
– Highly automated and efficient
• Application owns availability
– At scale everything breaks
– Focus on MTTR instead of MTBF
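The "MTTR instead of MTBF" point can be made concrete with the standard steady-state availability formula, availability = MTBF / (MTBF + MTTR). The sketch below uses made-up numbers (they are not from the talk) to compare doubling MTBF against cutting MTTR tenfold.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: fraction of time the component is up."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical figures, chosen only for illustration.
base = availability(mtbf_hours=1000, mttr_hours=10)
better_mtbf = availability(mtbf_hours=2000, mttr_hours=10)  # fails half as often
better_mttr = availability(mtbf_hours=1000, mttr_hours=1)   # recovers 10x faster

print(f"baseline:     {base:.6f}")
print(f"2x MTBF:      {better_mtbf:.6f}")
print(f"10x less MTTR:{better_mttr:.6f}")
```

With these assumed numbers, faster recovery improves availability more than more reliable hardware does, which is the trade-off the slide points at.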
![Page 8: Scalable Object Storage with Apache CloudStack and Apache Hadoop](https://reader033.vdocuments.site/reader033/viewer/2022050904/54535d2fb1af9f88228b45a8/html5/thumbnails/8.jpg)
At scale…everything breaks
[Diagram: the zone architecture again (DC edge, L3/L2 core, pods with access switch, hypervisors (Xen/VMware/KVM), VMs, primary storage (NFS/iSCSI/FC) with disks, and secondary storage with images and snapshots).]
![Page 9: Scalable Object Storage with Apache CloudStack and Apache Hadoop](https://reader033.vdocuments.site/reader033/viewer/2022050904/54535d2fb1af9f88228b45a8/html5/thumbnails/9.jpg)
Regions and zones
[Diagram: Region “West” contains Zone “West-Alpha”, Zone “West-Beta”, Zone “West-Gamma” and Zone “West-Delta”, connected by a low-latency backbone (e.g., SONET ring).]
![Page 10: Scalable Object Storage with Apache CloudStack and Apache Hadoop](https://reader033.vdocuments.site/reader033/viewer/2022050904/54535d2fb1af9f88228b45a8/html5/thumbnails/10.jpg)
Geographic separation
[Diagram: Region “West”, Region “East” and Region “South” connected over the Internet; Region “West” is labeled “Low Latency”.]
![Page 11: Scalable Object Storage with Apache CloudStack and Apache Hadoop](https://reader033.vdocuments.site/reader033/viewer/2022050904/54535d2fb1af9f88228b45a8/html5/thumbnails/11.jpg)
Secondary Storage in CloudStack 4.0
• NFS server default
– Can be mounted by hypervisor
– Easy to obtain, set up and operate
• Problems with NFS:
– Scale: max limits of file systems
  • Solution: CloudStack can manage multiple NFS stores (+ complexity)
– Performance
  • N hypervisors : 1 storage CPU / 1 network link
– Wide area suitability for cross-region storage
  • Chatty protocol
– Lack of replication
![Page 12: Scalable Object Storage with Apache CloudStack and Apache Hadoop](https://reader033.vdocuments.site/reader033/viewer/2022050904/54535d2fb1af9f88228b45a8/html5/thumbnails/12.jpg)
Object Storage in a region
[Diagram: Region “West” with Zone “West-Alpha”, Zone “West-Beta”, Zone “West-Gamma” and Zone “West-Delta”, all backed by a shared “Object Storage Technology” layer.]
• Replication
• Audit
• Repair
• Maintenance
![Page 13: Scalable Object Storage with Apache CloudStack and Apache Hadoop](https://reader033.vdocuments.site/reader033/viewer/2022050904/54535d2fb1af9f88228b45a8/html5/thumbnails/13.jpg)
Region “West”
Object Storage enables reliability
![Page 14: Scalable Object Storage with Apache CloudStack and Apache Hadoop](https://reader033.vdocuments.site/reader033/viewer/2022050904/54535d2fb1af9f88228b45a8/html5/thumbnails/14.jpg)
Object Storage also enables other applications
[Diagram: Region “West”; Object Store API Servers in front of the “Object Storage Technology” layer.]
• Dropbox
• Static content
• Archival
![Page 15: Scalable Object Storage with Apache CloudStack and Apache Hadoop](https://reader033.vdocuments.site/reader033/viewer/2022050904/54535d2fb1af9f88228b45a8/html5/thumbnails/15.jpg)
Object Storage characteristics
• Highly reliable and durable
– 99.9 % availability for AWS S3
– 99.999999999 % durability
• Massive scale
– 1.3 trillion objects stored across 7 AWS regions [Nov 2012 figures]
– Throughput: 830,000 requests per second
• Immutable objects
– Objects cannot be modified, only deleted
• Simple API
– PUT/POST objects, GET objects, DELETE objects
– No seek / no mutation / no POSIX API
• Flat namespace
– Everything stored in buckets
– Bucket names are unique
– Buckets can only contain objects, not other buckets
• Cheap and getting cheaper
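The API and namespace rules above fit in a few lines of code. The following is a toy in-memory sketch, not any real service's implementation; the class name `ToyObjectStore` and its methods are hypothetical. To make the "no mutation" rule visible it rejects writes to an existing key, which is a simplification: real S3-style services generally replace the whole object on a repeated PUT.

```python
class ToyObjectStore:
    """In-memory illustration of a flat, bucket-based object namespace."""

    def __init__(self):
        self._buckets = {}  # bucket name -> {object key -> bytes}

    def create_bucket(self, name: str) -> None:
        if name in self._buckets:
            raise ValueError(f"bucket name already taken: {name}")
        self._buckets[name] = {}

    def put(self, bucket: str, key: str, data: bytes) -> None:
        objects = self._buckets[bucket]
        if key in objects:
            # Simplification for this sketch: no in-place modification at all.
            raise ValueError(f"object already exists: {bucket}/{key}")
        objects[key] = bytes(data)  # whole-object write, no seek/append

    def get(self, bucket: str, key: str) -> bytes:
        return self._buckets[bucket][key]

    def delete(self, bucket: str, key: str) -> None:
        del self._buckets[bucket][key]


store = ToyObjectStore()
store.create_bucket("templates")
# A "/" in a key is just a character; there are no nested buckets or directories.
store.put("templates", "linux/centos-6.img", b"image-bytes")
fetched = store.get("templates", "linux/centos-6.img")
```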
![Page 16: Scalable Object Storage with Apache CloudStack and Apache Hadoop](https://reader033.vdocuments.site/reader033/viewer/2022050904/54535d2fb1af9f88228b45a8/html5/thumbnails/16.jpg)
CloudStack S3 API Server
Object Storage Technology
S3 API Servers
MySQL
![Page 17: Scalable Object Storage with Apache CloudStack and Apache Hadoop](https://reader033.vdocuments.site/reader033/viewer/2022050904/54535d2fb1af9f88228b45a8/html5/thumbnails/17.jpg)
CloudStack S3 API Server
• Understands AWS S3 REST-style and SOAP APIs
• Pluggable backend
– Backend storage needs to map simple calls to their API
  • E.g., createContainer, saveObject, loadObject
– Default backend is a POSIX filesystem
– Backend with Caringo Object Store (commercial vendor) available
– HDFS backend also available
• MySQL storage
– Bucket -> object mapping
– ACLs, bucket policies
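A rough idea of what a pluggable backend looks like, assuming only the three calls named on the slide. The Python names (`StorageBackend`, `PosixBackend`, `create_container`, `save_object`, `load_object`) are hypothetical translations for illustration and are not CloudStack's actual interface. The filesystem variant stands in for the "default backend is a POSIX filesystem" bullet.

```python
import hashlib
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class StorageBackend(ABC):
    """Hypothetical minimal contract an S3 front end could delegate to."""

    @abstractmethod
    def create_container(self, container: str) -> None: ...

    @abstractmethod
    def save_object(self, container: str, key: str, data: bytes) -> None: ...

    @abstractmethod
    def load_object(self, container: str, key: str) -> bytes: ...


class PosixBackend(StorageBackend):
    """Stores each container as a directory and each object as one file."""

    def __init__(self, root: str):
        self.root = Path(root)

    def _path(self, container: str, key: str) -> Path:
        # Hash the key so characters like "/" never become path components.
        # Container names are used as-is; a real backend would validate them.
        name = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return self.root / container / name

    def create_container(self, container: str) -> None:
        (self.root / container).mkdir(parents=True, exist_ok=True)

    def save_object(self, container: str, key: str, data: bytes) -> None:
        self._path(container, key).write_bytes(data)

    def load_object(self, container: str, key: str) -> bytes:
        return self._path(container, key).read_bytes()


with tempfile.TemporaryDirectory() as root:
    backend = PosixBackend(root)
    backend.create_container("images")
    backend.save_object("images", "templates/centos.img", b"image-bytes")
    roundtrip = backend.load_object("images", "templates/centos.img")
```

An HDFS or vendor backend would implement the same three methods; bucket-to-object mappings and ACLs would stay in the database, as the slide describes.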
![Page 18: Scalable Object Storage with Apache CloudStack and Apache Hadoop](https://reader033.vdocuments.site/reader033/viewer/2022050904/54535d2fb1af9f88228b45a8/html5/thumbnails/18.jpg)
Object Store Integration into CloudStack
• For images and snapshots
• Replacement for, or augmentation of, NFS secondary storage
• Integrations available with
– Riak CS
– OpenStack Swift
• New in 4.2 (upcoming):
– Framework for integrating storage providers
![Page 19: Scalable Object Storage with Apache CloudStack and Apache Hadoop](https://reader033.vdocuments.site/reader033/viewer/2022050904/54535d2fb1af9f88228b45a8/html5/thumbnails/19.jpg)
What do we want to build?
• Open source, ASL licensed object storage
• Scales to at least 1 billion objects
• Reliability and durability on par with S3
• S3 API (or similar, e.g., Google Storage)
• Tooling around maintenance and operation, specific to object storage
![Page 20: Scalable Object Storage with Apache CloudStack and Apache Hadoop](https://reader033.vdocuments.site/reader033/viewer/2022050904/54535d2fb1af9f88228b45a8/html5/thumbnails/20.jpg)
The following slides are a design discussion
![Page 21: Scalable Object Storage with Apache CloudStack and Apache Hadoop](https://reader033.vdocuments.site/reader033/viewer/2022050904/54535d2fb1af9f88228b45a8/html5/thumbnails/21.jpg)
Architecture of Scalable Object Storage
API Servers
Auth Servers
Object Servers Replicators/Auditors
Object Lookup Servers
![Page 22: Scalable Object Storage with Apache CloudStack and Apache Hadoop](https://reader033.vdocuments.site/reader033/viewer/2022050904/54535d2fb1af9f88228b45a8/html5/thumbnails/22.jpg)
Why HDFS
• ASF Project (Apache Hadoop)
• Immutable objects, replication
• Reliability, scale and performance
– 200 million objects in 1 cluster [Facebook]
– 100 PB in 1 cluster [Facebook]
• Simple operation
– Just add data nodes
![Page 23: Scalable Object Storage with Apache CloudStack and Apache Hadoop](https://reader033.vdocuments.site/reader033/viewer/2022050904/54535d2fb1af9f88228b45a8/html5/thumbnails/23.jpg)
HDFS-based Object Storage
S3 API Servers
S3 Auth Servers
Data nodes
Namenode pair
HDFS API
![Page 24: Scalable Object Storage with Apache CloudStack and Apache Hadoop](https://reader033.vdocuments.site/reader033/viewer/2022050904/54535d2fb1af9f88228b45a8/html5/thumbnails/24.jpg)
BUT
• Name Node scalability
– 150 bytes RAM / block
– GC issues
• Name Node SPOF
– Being addressed in the community ✔
• Cross-zone replication
– Rack-awareness placement ✔
– What if the zones are spread a little further apart?
• Storage for object metadata
– ACLs, policies, timers
![Page 25: Scalable Object Storage with Apache CloudStack and Apache Hadoop](https://reader033.vdocuments.site/reader033/viewer/2022050904/54535d2fb1af9f88228b45a8/html5/thumbnails/25.jpg)
Name Node scalability
• 1 billion objects = 3 billion blocks (chunks)
– Average of 5 MB/object = 5 PB (actual), 15 PB (raw)
– 450 GB of RAM per Name Node
  • 150 bytes x 3 x 10^9
– 16 TB / node => 1000 Data nodes
• Requires Name Node federation?
• Or an approach like HAR files
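The sizing above can be reproduced as a short calculation. The inputs are the slide's own figures (1 billion objects, 3 block replicas counted per object, 5 MB average object, 150 bytes of Name Node RAM per block, 16 TB per data node); decimal units are an assumption, and the variable names are illustrative.

```python
OBJECTS = 1_000_000_000
REPLICAS = 3                         # slide counts 3 blocks (chunks) per object
AVG_OBJECT_BYTES = 5 * 10**6         # 5 MB, decimal units assumed
RAM_BYTES_PER_BLOCK = 150            # Name Node memory per block
DATA_NODE_BYTES = 16 * 10**12        # 16 TB of disk per data node

blocks = OBJECTS * REPLICAS                          # 3 x 10^9 blocks
actual_bytes = OBJECTS * AVG_OBJECT_BYTES            # logical data stored
raw_bytes = actual_bytes * REPLICAS                  # data on disk after replication
name_node_ram_bytes = blocks * RAM_BYTES_PER_BLOCK   # Name Node heap needed
data_nodes = -(-raw_bytes // DATA_NODE_BYTES)        # ceiling division

print(f"blocks:        {blocks:,}")
print(f"actual data:   {actual_bytes / 10**15:.0f} PB")
print(f"raw data:      {raw_bytes / 10**15:.0f} PB")
print(f"Name Node RAM: {name_node_ram_bytes / 10**9:.0f} GB")
print(f"data nodes:    {data_nodes}")
```

The node count comes out slightly under the slide's rounded figure of 1000; the 450 GB heap is the number that motivates federation or a HAR-like packing scheme.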
![Page 26: Scalable Object Storage with Apache CloudStack and Apache Hadoop](https://reader033.vdocuments.site/reader033/viewer/2022050904/54535d2fb1af9f88228b45a8/html5/thumbnails/26.jpg)
Name Node Federation
Extension: Federated NameNodes are HA pairs
![Page 27: Scalable Object Storage with Apache CloudStack and Apache Hadoop](https://reader033.vdocuments.site/reader033/viewer/2022050904/54535d2fb1af9f88228b45a8/html5/thumbnails/27.jpg)
Federation issues
• HA for name nodes
• Namespace shards
– Map object -> name node
• Requires another scalable key-value store
– HBase?
• Rebalancing between name nodes
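One way to picture the object -> name node mapping is a two-step lookup: hash the object name to a fixed shard, then look the shard up in a small table. This is a swapped-in simplification, not what the talk proposes: the slide suggests a scalable key-value store such as HBase, while the sketch below uses a plain dict and hypothetical name node names. Its only purpose is to show why rebalancing means moving shards rather than rehashing every object.

```python
import hashlib

NUM_SHARDS = 1024  # hypothetical fixed shard count, chosen for illustration


def shard_for(bucket: str, key: str) -> int:
    """Deterministically map an object name to a namespace shard."""
    digest = hashlib.sha256(f"{bucket}/{key}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


# Hypothetical name nodes; in a real system this table would live in a
# replicated store, not in process memory.
name_nodes = ["nn-a", "nn-b", "nn-c"]
shard_table = {s: name_nodes[s % len(name_nodes)] for s in range(NUM_SHARDS)}


def name_node_for(bucket: str, key: str) -> str:
    return shard_table[shard_for(bucket, key)]


owner_before = name_node_for("images", "centos.img")

# Rebalancing: hand one shard to a new name node. Only objects in that
# shard change owner; every other mapping is untouched.
moved_shard = shard_for("images", "centos.img")
shard_table[moved_shard] = "nn-d"
owner_after = name_node_for("images", "centos.img")
```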
![Page 28: Scalable Object Storage with Apache CloudStack and Apache Hadoop](https://reader033.vdocuments.site/reader033/viewer/2022050904/54535d2fb1af9f88228b45a8/html5/thumbnails/28.jpg)
Replication over lossy/slower links
A. Asynchronous replication
– Use distcp to replicate between clusters
– 6 copies vs. 3
– Master/Slave relationship
  • Possibility of loss of data during failover
  • Need coordination logic outside of HDFS
B. Synchronous replication
– API server writes to 2 clusters and acks only when both writes are successful
– Availability compromised when one zone is down
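Option B can be sketched as follows. `FakeCluster` and `replicated_put` are hypothetical stand-ins for two HDFS clusters and the API server's write path; the rollback is best-effort and assumes objects are never overwritten, so deleting a partial write cannot destroy an older version.

```python
class ClusterDown(Exception):
    """Raised by the fake cluster when its zone is unreachable."""


class FakeCluster:
    def __init__(self, name: str):
        self.name = name
        self.up = True
        self.objects = {}

    def write(self, key: str, data: bytes) -> None:
        if not self.up:
            raise ClusterDown(self.name)
        self.objects[key] = data

    def delete(self, key: str) -> None:
        self.objects.pop(key, None)


def replicated_put(clusters, key: str, data: bytes) -> bool:
    """Acknowledge (return True) only if every cluster accepted the write."""
    written = []
    try:
        for cluster in clusters:
            cluster.write(key, data)
            written.append(cluster)
    except ClusterDown:
        for cluster in written:
            cluster.delete(key)  # best-effort cleanup of the partial write
        return False
    return True


west, east = FakeCluster("west"), FakeCluster("east")
ok_both_up = replicated_put([west, east], "obj-1", b"data")

east.up = False  # simulate one zone going down
ok_one_down = replicated_put([west, east], "obj-2", b"data")
```

With one zone down every write is refused, which is the availability cost named on the slide.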
![Page 29: Scalable Object Storage with Apache CloudStack and Apache Hadoop](https://reader033.vdocuments.site/reader033/viewer/2022050904/54535d2fb1af9f88228b45a8/html5/thumbnails/29.jpg)
CAP Theorem
Consistency or Availability during partition
Many nuances
![Page 30: Scalable Object Storage with Apache CloudStack and Apache Hadoop](https://reader033.vdocuments.site/reader033/viewer/2022050904/54535d2fb1af9f88228b45a8/html5/thumbnails/30.jpg)
Storage for object metadata
A. Store it in HDFS along with the object
– Reads are expensive (e.g., to check ACL)
– Mutable data, needs layer over HDFS
B. Use another storage system (e.g., HBase)
– Name node federation also requires this
C. Modify Name Node to store metadata
– High performance
– Not extensible
![Page 31: Scalable Object Storage with Apache CloudStack and Apache Hadoop](https://reader033.vdocuments.site/reader033/viewer/2022050904/54535d2fb1af9f88228b45a8/html5/thumbnails/31.jpg)
Object store on HDFS: Future
• Viable for small-sized deployments
– Up to 100-200 million objects
– Datacenters close together
• Larger deployments need development
– No effort ongoing at this time
![Page 32: Scalable Object Storage with Apache CloudStack and Apache Hadoop](https://reader033.vdocuments.site/reader033/viewer/2022050904/54535d2fb1af9f88228b45a8/html5/thumbnails/32.jpg)
Conclusion
• CloudStack needs object storage for “cloud-style” workloads
• Object Storage is not easy
• HDFS comes close but not close enough
• Join the community!