Google Bigtable

Seminar on multimedia

Henrik Kumlander

60426H, htkumlan(a)cc.hut.fi

Intro

► Challenge: Google's applications (and those of other parties) needed a scalable, flexible, reliable distributed storage system for structured data

► Bigtable development began in 2004; it is now in production use

► Distributed storage system for managing structured data that is designed to scale to a very large size (petabytes on thousands of commodity servers)

► Many projects at Google store data in Bigtable, including web indexing, Google Earth, MapReduce, Blogger.com, Orkut (…over 60 apps or projects use Bigtable)

► Applications place very different demands on Bigtable, both in terms of data size (from URLs to web pages to satellite imagery) and latency requirements

General features 1/4

► Provides clients with a simple data model that supports dynamic control over data layout and format

► Data is indexed using row and column names that can be arbitrary strings

► Bigtable is a sparse, distributed, multidimensional sorted map

► The map is indexed by a row key, column key, and a timestamp; each value in the map is an uninterpreted array of bytes

► (row:string, column:string, time:int64) → string

A slice of an example table that stores Web pages
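The map signature above can be sketched in a few lines of Python. This is a toy in-memory illustration only: the class `ToyBigtable` and its methods are hypothetical names, not Bigtable's client API, and the row and column keys are made-up examples in the style of a Web-page table.

```python
from typing import Dict, List, Optional, Tuple

Key = Tuple[str, str, int]  # (row key, column key, timestamp)


class ToyBigtable:
    """Toy sparse map: only cells that were actually written take up space."""

    def __init__(self) -> None:
        self._cells: Dict[Key, bytes] = {}

    def put(self, row: str, column: str, timestamp: int, value: bytes) -> None:
        # Values are uninterpreted bytes; the map never looks inside them.
        self._cells[(row, column, timestamp)] = value

    def get(self, row: str, column: str, timestamp: int) -> Optional[bytes]:
        return self._cells.get((row, column, timestamp))

    def scan_rows(self, start_row: str, end_row: str) -> List[Tuple[Key, bytes]]:
        """Return cells with start_row <= row < end_row, in sorted key order."""
        keys = sorted(k for k in self._cells if start_row <= k[0] < end_row)
        return [(k, self._cells[k]) for k in keys]


table = ToyBigtable()
table.put("com.cnn.www", "contents:", 3, b"<html>...")
table.put("com.cnn.www", "anchor:cnnsi.com", 9, b"CNN")
table.put("org.example", "contents:", 5, b"<html>...")

# Because keys are sorted, all rows sharing a prefix form one contiguous range.
com_cells = table.scan_rows("com.", "com/")
```

A real implementation keeps the data sorted on disk rather than sorting on every scan; the sketch sorts lazily only to stay short.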

Timestamps = versioning

► Each cell in a Bigtable can contain multiple versions of the same data

► Versions are indexed by timestamp

► Timestamps are 64-bit integers

► They can be assigned by Bigtable or by the application using Bigtable
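A minimal sketch of cell versioning follows. The class `VersionedCell`, the `max_versions` retention setting, and the use of microseconds for automatically assigned timestamps are assumptions of this example, not details taken from the slides.

```python
import time
from typing import Dict, List, Optional


class VersionedCell:
    """Toy cell holding several timestamped versions of one value."""

    def __init__(self, max_versions: int = 3) -> None:
        self.max_versions = max_versions  # hypothetical retention setting
        self._versions: Dict[int, bytes] = {}

    def write(self, value: bytes, timestamp: Optional[int] = None) -> int:
        # If the application supplies no timestamp, assign one from the clock
        # (microseconds here; the unit is an assumption of this sketch).
        if timestamp is None:
            timestamp = time.time_ns() // 1_000
        self._versions[timestamp] = value
        # Drop the oldest versions beyond the retention limit.
        for old in sorted(self._versions, reverse=True)[self.max_versions:]:
            del self._versions[old]
        return timestamp

    def read(self, at_or_before: Optional[int] = None) -> Optional[bytes]:
        """Return the newest version, or the newest one not after a timestamp."""
        for ts in sorted(self._versions, reverse=True):
            if at_or_before is None or ts <= at_or_before:
                return self._versions[ts]
        return None

    def timestamps(self) -> List[int]:
        return sorted(self._versions, reverse=True)


cell = VersionedCell(max_versions=2)
cell.write(b"v1", timestamp=10)
cell.write(b"v2", timestamp=20)
cell.write(b"v3", timestamp=30)  # pushes out the version written at 10
```

Reading with `at_or_before` shows why timestamps act as versions: the same cell can answer "what was the value at time 25?" as well as "what is the latest value?".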

General features 2/4

► Built on the distributed Google File System (GFS), and also:

► Uses Google’s Chubby Lock Service to synchronize accesses to shared resources. Chubby provides a limited but reliable distributed file system

► Supports execution of client-supplied scripts in the address spaces of the servers. The scripts are written in Sawzall, a data-processing language developed at Google

► At the moment, Bigtable does not allow client Sawzall scripts to write back into Bigtable, but it does allow various forms of data transformation, filtering based on arbitrary expressions, and summarization via a variety of operators

► MapReduce can use Bigtable as input source and output target

► Depends on a cluster management system for scheduling jobs, managing resources on shared machines, dealing with machine failures, and monitoring machine status

General features 3/4

► Each row range is called a tablet and is a unit for distribution and load balancing

► Column keys are grouped into column families, which form the unit of access control (data in a column family is usually of the same type and is compressed together)

► Bigtable implementation has three major components: a library that is linked into every client, one master server, and many tablet servers

► Tables are optimized for GFS by being split into multiple tablets: segments of the table, split along a row chosen so that each tablet is ~200 megabytes in size. Tablets are distributed across the tablet servers

Tablet structure, similar to a B+ tree
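The splitting of a sorted table into row-range tablets can be sketched as follows. The function names are hypothetical, identifying each tablet by its last row key is a choice of this sketch, and the lookup is a flat binary search rather than the multi-level, B+-tree-like structure mentioned above.

```python
from bisect import bisect_left
from typing import List, Tuple

TARGET_TABLET_BYTES = 200 * 1024 * 1024  # ~200 MB, as stated above


def split_into_tablets(rows: List[Tuple[str, int]],
                       target: int = TARGET_TABLET_BYTES) -> List[str]:
    """Given (row_key, size_in_bytes) pairs sorted by key,
    return the last row key of each tablet."""
    end_keys: List[str] = []
    current = 0
    for key, size in rows:
        current += size
        if current >= target:  # tablet is full: close it at this row
            end_keys.append(key)
            current = 0
    if rows and current > 0:  # leftover rows form the final tablet
        end_keys.append(rows[-1][0])
    return end_keys


def locate_tablet(end_keys: List[str], row_key: str) -> int:
    """Index of the tablet whose row range contains row_key."""
    if not end_keys:
        raise ValueError("table has no tablets")
    index = bisect_left(end_keys, row_key)
    # Keys past the last end key fall into the last tablet in this toy model.
    return min(index, len(end_keys) - 1)


# Small target so the example splits after a handful of rows.
rows = [("a", 120), ("b", 90), ("c", 50), ("d", 160), ("e", 30)]
end_keys = split_into_tablets(rows, target=200)
```

Because rows are kept in sorted order, finding the tablet for a key needs only the list of end keys, not the data itself.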

General features 4/4

► Clients can group multiple column families together into a locality group. A separate SSTable is generated for each locality group in each tablet

► Clients can control whether or not the SSTables for a locality group are compressed, and if so, which compression format is used

► Many clients use a two-pass custom compression scheme. The first pass uses Bentley and McIlroy's scheme, which compresses long common strings across a large window. The second pass uses a fast compression algorithm that looks for repetitions in a small 16 KB window of the data. Both compression passes are very fast—they encode at 100-200 MB/s, and decode at 400-1000 MB/s on modern machines
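The idea of locality groups with per-group compression can be illustrated with a small sketch. Everything named here is hypothetical: the group names, the family-to-group mapping, and the record layout. `zlib` is used purely as a stand-in compressor; it is not the two-pass scheme described above.

```python
import zlib
from typing import Dict, List, Optional

# Hypothetical mapping from column family to locality group.
LOCALITY_GROUPS = {"contents": "page_data",
                   "language": "metadata",
                   "checksum": "metadata"}

# Hypothetical per-group setting: None means "store uncompressed".
COMPRESSION: Dict[str, Optional[str]] = {"page_data": "zlib", "metadata": None}


def family_of(column_key: str) -> str:
    """Column keys have the form family:qualifier."""
    return column_key.split(":", 1)[0]


def build_group_files(cells: Dict[str, bytes]) -> Dict[str, bytes]:
    """Serialize one tablet's cells into one blob per locality group."""
    grouped: Dict[str, List[bytes]] = {}
    for column_key in sorted(cells):
        group = LOCALITY_GROUPS[family_of(column_key)]
        grouped.setdefault(group, []).append(
            column_key.encode() + b"=" + cells[column_key])
    files: Dict[str, bytes] = {}
    for group, records in grouped.items():
        blob = b"\n".join(records)
        if COMPRESSION[group] == "zlib":  # stand-in compressor
            blob = zlib.compress(blob)
        files[group] = blob
    return files


cells = {
    "contents:": b"<html>" + b"lorem ipsum " * 200 + b"</html>",
    "language:": b"EN",
    "checksum:": b"abc123",
}
files = build_group_files(cells)
```

Keeping small metadata columns in their own group means a reader interested only in `language:` never has to touch, or decompress, the large page contents.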

Some statistics 1/2

Some examples of Bigtable use in various applications

Some statistics 2/2

Performance statistics

• 1200 random reads per second per tablet server translates into approximately 75 MB/s of data read from GFS

• Throughput does not scale linearly with the number of tablet servers

• Worst case (random reads, the slowest benchmark): going from one tablet server to 500 increases throughput only about 100x

• Other benchmarks scale better, with speed increases of up to almost 300x
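The figures on this slide can be checked with simple arithmetic. The 64 KB block size per read is taken from the Bigtable paper's description of the random-read benchmark and is not stated on the slide; the helper `scaling_efficiency` is a hypothetical name for this sketch.

```python
READS_PER_SECOND = 1200
BLOCK_KB = 64  # block fetched from GFS per random read (from the paper, not the slide)

# Each read pulls a whole block even if only a small value is needed.
mb_per_second = READS_PER_SECOND * BLOCK_KB / 1024


def scaling_efficiency(speedup: float, server_factor: float) -> float:
    """Fraction of ideal linear scaling that was actually achieved."""
    return speedup / server_factor


worst_case = scaling_efficiency(100, 500)   # random reads
better_case = scaling_efficiency(300, 500)  # best-scaling figure quoted above

print(f"{mb_per_second:.0f} MB/s read from GFS")
print(f"worst case: {worst_case:.0%} of linear; better case: {better_case:.0%} of linear")
```

This prints 75 MB/s, and 20% and 60% of linear scaling respectively, which is what "does not scale linearly" amounts to in numbers.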

Future of Bigtable

► Support for secondary indices and infrastructure for building cross-data-center replicated Bigtables with multiple master replicas

► Deploying Bigtable as a service to product groups, so that individual groups do not need to maintain their own clusters. As service clusters scale, more resource-sharing issues will need to be dealt with inside Bigtable itself

► Bigtable's unusual interface has been difficult for users to adapt to. New users are sometimes uncertain of how best to use the Bigtable interface, particularly if they are accustomed to relational databases that support general-purpose transactions

Sources (+further reading)

Research papers:

► Bigtable: A Distributed Storage System for Structured Data (Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber)

► The Google File System (Sanjay Ghemawat, Howard Gobioff, Shun-Tak Leung)

► The Chubby Lock Service for Loosely-Coupled Distributed Systems (Mike Burrows)

Blog:

► Is the Relational Database Doomed? http://www.readwriteweb.com/enterprise/2009/02/is-the-relational-database-doomed.php