Gluster: Where We've Been - A History


Gluster: Where We've Been

AB Periasamy, Office of the CTO, Red Hat

John Mark Walker, Gluster Community Guy

Topics

The Big Idea

Humble beginnings: from Bangalore to Milpitas

Scale-out + Open source == WINNING: user-space, no metadata server, stackable

Cloud and commoditization

A Data Explosion!

74% == Unstructured data annual growth

63,000 PB == Scale-out storage in 2015

40% == storage-related expense for cloud

44x == Unstructured data volume growth by 2020

[Photos: Bengaluru office, conference room, US head office]

Gluster Community Deployments

Gluster Production Deployments

What Can You Store?

Media: docs, photos, video

VM filesystem: VM disk images

Big data: log files, RFID data

Objects: long-tail data

The big idea: storage should be simple

Simple, scalable, low-cost

Add examples where complexity has been bad: EMC, Cisco, Brocade et al. made a business out of complexity through certification; if storage is too complicated, it doesn't scale.

What is GlusterFS, Really?

Gluster is a unified, distributed storage system: DHT, stackable, POSIX, Swift, HDFS

Discuss the approach: how GlusterFS is unique and different from other approaches

- Lessons from GNU Hurd: a user-space distributed storage operating system
- Reimplements parts of the OS in user space: scheduler, POSIX locking, RDMA, memory management (cf. JVM, Python, etc.)
- No metadata separation

Phase 1: Lego Kit for Storage

"People who think that userspace filesystems are realistic for anything but toys are just misguided." - Linus Torvalds

Goal: create a global namespace

If you have a bunch of files, serving them should be as simple as running an FTP server. Doing this in user space required FUSE, a POSIX translator, a NAS protocol, and a cluster translator.

volume testvol-posix
  type storage/posix
  option directory /media/datastore
  option volume-id 329e31c1-04cc-4386-8bb8-xxxx
end-volume

volume testvol-access-control
  type features/access-control
  subvolumes testvol-posix
end-volume

volume testvol-locks
  type features/locks
  subvolumes testvol-access-control
end-volume

volume testvol-io-threads
  type performance/io-threads
  subvolumes testvol-locks
end-volume
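To export a hand-written stack like this to clients, a protocol/server translator would typically sit on top of the brick stack. The section below is an illustrative sketch in the 2.x volfile style, not taken from the slides; the volume name and the wildcard auth rule are assumptions.

# sketch only: export the testvol brick stack over TCP
volume testvol-server
  type protocol/server
  option transport-type tcp
  option auth.addr.testvol-io-threads.allow *
  subvolumes testvol-io-threads
end-volume

A client-side volfile would then point a protocol/client translator at this server and hand the result to the FUSE mount.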

Versions 1.x / 2.x

Hand-crafted volume definition files (see the examples above)

Simple configuration files

Faster than tape? It's good!

Phase 2: Repeatability of Use Cases

Community-led

Learned from the community: desired features

Usage profiles

All about scalable storage of unstructured data

Learned about missing features

Found the largest problem and set out to solve it: patterns emerged, and scalable storage of unstructured data was the #1 problem people wanted to solve.

Had a clearer idea of where we wanted to go: a clear direction.

GlusterFS 3.0: Putting it all together

Adding, removing features

Templates and recipes for common use cases

Standalone NFS replacement

Active-active replicated storage

Scalable, distributed storage

..

And then scalable, replicated distributed storage (see the volfile sketch below)

+ other combos
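To make the "scalable, replicated distributed storage" combination concrete, here is a minimal sketch of how such a client-side stack was composed from translators. Only the translator types (cluster/replicate, cluster/distribute) come from GlusterFS; the volume names and the brick subvolumes (repl-0, brick-1a, and so on, which would be protocol/client volumes defined earlier in the file) are invented for the example.

# sketch only: mirror pairs of bricks, then hash files across the pairs
volume repl-0
  type cluster/replicate
  subvolumes brick-1a brick-1b
end-volume

volume repl-1
  type cluster/replicate
  subvolumes brick-2a brick-2b
end-volume

volume dht-0
  type cluster/distribute
  subvolumes repl-0 repl-1
end-volume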

GlusterFS 3.1 - 2010

Elasticity: add and remove volumes w/ glusterd

Automation: CLI, scriptable

Elastic features driven by cloud and virtualization usage:
- shared storage for virtual guests
- flexible, self-service storage
- elastic volume management became a requirement
- automated provisioning of storage with the CLI

(native NFS server? Or 3.2?)

CLI Magic

$ gluster peer probe HOSTNAME
$ gluster volume info
$ gluster volume create VOLNAME [stripe COUNT] \
    [replica COUNT] [transport tcp | rdma] BRICK
$ gluster volume delete VOLNAME
$ gluster volume add-brick VOLNAME NEW-BRICK ...
$ gluster volume rebalance VOLNAME start
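As a concrete usage sketch (the hostnames, brick paths, and volume name below are invented for the example), provisioning and mounting a two-way replicated volume with this CLI looks roughly like:

$ gluster peer probe server2
$ gluster volume create testvol replica 2 transport tcp \
    server1:/export/brick1 server2:/export/brick1
$ gluster volume start testvol
$ mount -t glusterfs server1:/testvol /mnt/testvol

The point of the slide stands: every step is scriptable, so storage provisioning can be automated end to end.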

GlusterFS 3.2 - 2011

Native NFS server

Marker framework

Geo-replication: asynchronous

Marker framework:
- the story of why it's necessary
- backing up data to other locales
- don't need an entire snapshot
- users wanted continuous, unlimited replication
- don't want on-demand sysadmin intervention
- queries the filesystem to find which files have changed
- manages a queue, telling rsync exactly which files have changed

Inotify doesn't scale: if the daemon crashes, it stops tracking changes; we would have had to write a journaling feature to maintain the change queue.

Geo-replication can work on high-latency, flaky networks
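As a sketch of how this looked from the 3.2 CLI (the master volume name and the slave host and path are invented for the example), a geo-replication session was driven with commands along these lines:

$ gluster volume geo-replication mastervol slavehost:/data/remote_dir start
$ gluster volume geo-replication mastervol slavehost:/data/remote_dir status
$ gluster volume geo-replication mastervol slavehost:/data/remote_dir stop

The marker framework feeds the session with the set of changed files, so the slave copy stays in sync without sysadmin intervention or full rescans.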

And now for something completely different: commoditization and the changing economics of storage

Why we're winning

Simple Economics: simplicity, scalability, lower cost

Multi-Tenant

Virtualized

Automated

Commoditized

Scale on Demand

In the Cloud

Scale Out

Open Source

Simplicity Bias

FC, FCoE, iSCSI → HTTP, Sockets

Modified BSD OS → Linux / User Space / C, Python & Java

Appliance based → Application based

Scale-out Open Source is the winner

Thank you!

AB Periasamy, Office of the CTO, Red Hat, [email protected]

John Mark Walker, Gluster Community Guy, [email protected]