HBaseCon 2012 | HBase Security for the Enterprise - Andrew Purtell, Trend Micro


Andrew Purtell, Trend Micro
On behalf of the Trend Hadoop Group
[email protected]

HBase Security for the Enterprise

Agenda

• Who we are

• Motivation

• Use Cases

• Implementation

• Experience

• Quickstart Tutorial

Introduction

Trend Micro

Headquartered: Tokyo, Japan
Founded: Los Angeles, 1988

• Technology innovator and top ranked security solutions provider

• 4,000+ employees worldwide

Trend Micro Smart Protection Network

[Diagram: the Smart Protection Network. Threat collection inputs (customers, partners, TrendLabs research, service and support; samples, submissions, honeypots, web crawling, feedback loops, behavioral analysis; partner ISPs, routers, etc.) feed a central data platform that drives the Web Reputation, Email Reputation, and File Reputation services across endpoint, gateway, messaging, SaaS, cloud, and management products, on and off the network.]

• Information integration is our advantage

Trend Hadoop Group

• We curate and support a complete internal distribution
• We act within the ASF community processes on behalf of internal stakeholders, and are ASF evangelists

[Diagram: components of the internal distribution.]

Supported: Core (HDFS, MapReduce), HBase, ZooKeeper, Pig, Sqoop

Not supported, monitoring: Avro, Mahout, Hive, Oozie, Giraph, Flume, Cascading, Solr, Gora

Motivation

Our Challenges

• As we grow our business we see the network effects of our customers' interactions with the Internet and each other

• This is a volume, variety, and velocity problem

Why HBase?

• For our Hadoop-based applications, if we were forced to use MR for every operation, it would not be useful
• Fortunately, HBase provides low-latency random access to very large data tables and first-class Hadoop platform integration

But...

• Hadoop, for us, is the centerpiece of a data management consolidation strategy

• (Prior to release 0.92) HBase did not have intrinsic access control facilities

• Why do we care? Provenance, fault isolation, data sensitivity, auditable controls, ...

Our Solution

• Use HBase where appropriate
• Build in the basic access control features we need (added in 0.92, evolving in 0.94+)
• Do so with a community-sanctioned approach
• As a byproduct of this work, we have Coprocessors, separately interesting

Use Cases

Meta

• Our meta use case: Data integration, storage and service consolidation

Yesterday: Data islands

Today: “Data neighborhood”

Application Fault Isolation

• Multitenant cluster, multiple application dev teams
• Need to strongly authenticate users to all system components: HDFS, HBase, ZooKeeper
• Rogue users cannot subvert authentication
• Allow and enforce restrictive permissions on internal application state: files, tables/CFs, znodes

Private Table (Default case)

• Strongly authenticate users to all system components
• Assign ownership when a table is created
• Allow only the owner full access to table resources
• Deny all others
• (Optional) Privacy on the wire with encrypted RPC

• Internal application state
• Applications under development, proofs of concept

Sensitive Column Families in Shared Tables

• Strongly authenticate users to all system components
• Grant read or read-write permissions to some CFs
• Restrict access to one or more other CFs only to owner
• Requires ACLs at per-CF granularity
• Default deny to help avoid policy mistakes

• Domain Reputation Repository (DRR)
• Tracking and logging system (TLS), like Google's Dapper

Read-only Access for Ad Hoc Query

• Strongly authenticate users to all system components
• Need to supply HBase delegation tokens to MR (see the sketch after this list)
• Grant write permissions to data ingress and analytic pipeline processes
• Grant read-only permissions for ad hoc uses, such as Pig jobs
• Default deny to help avoid policy mistakes

• Knowledge discovery via ad hoc query (Pig)
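Where MR jobs run against a cluster with RPC security enabled, the job must carry an HBase delegation token. Below is a minimal sketch of attaching one to a job's credentials; TokenUtil ships with the security build, but the exact method signature has varied across releases, so treat the call shape here as an assumption to check against your version.

```java
// Sketch: fetch an HBase delegation token (served by the TokenProvider
// coprocessor) and cache it in the MR job credentials, so tasks can
// authenticate via DIGEST-MD5 rather than needing Kerberos tickets.
// Note: the obtainTokenForJob() signature below is an assumption.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.security.token.TokenUtil;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.security.UserGroupInformation;

public class SubmitWithToken {
  public static Job prepare() throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = new Job(conf, "adhoc-read"); // hypothetical job name
    TokenUtil.obtainTokenForJob(job, UserGroupInformation.getCurrentUser());
    return job;
  }
}
```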

Implementation

Goals and Non-Goals

Goals

• Satisfy use cases
• Use what Secure Hadoop Core provides as much as possible
• Minimally invasive to core code

Non-Goals

• Row-level or per-value (cell) access control
• Complex policy, full role-based access control
• Push down of file ownership to HDFS

Coprocessors

• Inspired by Bigtable coprocessors, hinted at like the Higgs Boson in Jeff Dean's LADIS '09 keynote talk

• Dynamically installed code that runs at each region in the RegionServers, loaded on a per-table basis:

Observers: Like database triggers, provide event-based hooks for interacting with normal operations

Endpoints: Like stored procedures, custom RPC methods called explicitly with parameters

• A high-level call interface for clients: Calls addressed to rows or ranges of rows are mapped to data location and parallelized by client library

• Access checking is done by an Observer (a minimal sketch follows below)
• New security APIs implemented as Endpoints
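To make the Observer hook concrete, here is a minimal sketch of a region observer, assuming the 0.92-era coprocessor API (hook signatures moved around in later releases). The class name and the restricted column family are hypothetical; the real access checks live in the AccessController described later.

```java
// Sketch: a trigger-like observer that runs before every Get in a region
// and refuses reads that touch a particular column family.
import java.io.IOException;
import java.util.List;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.coprocessor.BaseRegionObserver;
import org.apache.hadoop.hbase.coprocessor.ObserverContext;
import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
import org.apache.hadoop.hbase.util.Bytes;

public class RestrictedFamilyObserver extends BaseRegionObserver {
  private static final byte[] RESTRICTED = Bytes.toBytes("secret"); // hypothetical CF

  @Override
  public void preGet(ObserverContext<RegionCoprocessorEnvironment> e,
      Get get, List<KeyValue> results) throws IOException {
    // Runs server-side, before the normal Get path; throwing aborts the call
    if (get.getFamilyMap().containsKey(RESTRICTED)) {
      throw new IOException("reads of the 'secret' family are not permitted");
    }
  }
}
```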

Authentication

• Built on Secure Hadoop

Client authentication via Kerberos, a trusted third party

Secure RPC based on SASL

• SASL can negotiate encryption and/or message integrity verification on a per connection basis

• Make RPC extensible and pluggable, add a SecureRpcEngine option

• Support DIGEST-MD5 authentication, allowing Hadoop delegation token use for MapReduce

TokenProvider, a Coprocessor that provides and verifies HBase delegation tokens, and manages shared secrets on the cluster
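Below is a sketch of the client-side configuration that selects the secure engine, assuming the 0.92 security build; in practice these properties live in hbase-site.xml, and the server side additionally needs its Kerberos principal and keytab configured (hbase.regionserver.kerberos.principal, hbase.regionserver.keytab.file).

```java
// Sketch: enable SASL/Kerberos RPC by swapping in the pluggable secure engine.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class SecureClientConf {
  public static Configuration create() {
    Configuration conf = HBaseConfiguration.create();
    // Select the SASL-capable engine instead of the default RPC engine
    conf.set("hbase.rpc.engine", "org.apache.hadoop.hbase.ipc.SecureRpcEngine");
    // Require Kerberos authentication ("simple" trusts the asserted identity)
    conf.set("hbase.security.authentication", "kerberos");
    return conf;
  }
}
```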

Authorization – AccessController

• AccessController: A Coprocessor that manages access control lists

• Simple and familiar permissions model: READ, WRITE, CREATE, ADMIN

• Permissions grantable at table, column family, and column qualifier granularity

• Supports user- and group-based assignment
• The Hadoop group mapping service can model application roles as groups (a toy model of the permission check follows)
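This is not the HBase client API, just a toy model of how grants at the three granularities resolve, with default deny; all names here are hypothetical.

```java
// Toy model: a grant at table, column family, or qualifier scope authorizes
// the matching access; anything not explicitly granted is denied.
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class PermissionModel {
  enum Action { READ, WRITE, CREATE, ADMIN }

  // Keys look like "user:table", "user:table:cf", "user:table:cf:qualifier"
  private final Map<String, Set<Action>> grants = new HashMap<String, Set<Action>>();

  public void grant(String scope, Action action) {
    Set<Action> actions = grants.get(scope);
    if (actions == null) {
      actions = new HashSet<Action>();
      grants.put(scope, actions);
    }
    actions.add(action);
  }

  public boolean allowed(String user, String table, String cf,
                         String qualifier, Action action) {
    // Check coarse scopes first: a table-wide grant covers all its CFs
    String[] scopes = {
        user + ":" + table,
        user + ":" + table + ":" + cf,
        user + ":" + table + ":" + cf + ":" + qualifier };
    for (String scope : scopes) {
      Set<Action> actions = grants.get(scope);
      if (actions != null && actions.contains(action)) {
        return true;
      }
    }
    return false; // default deny
  }
}
```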


Authorization – Secure ZooKeeper

• ZooKeeper plays a critical role in HBase cluster operations and in the security implementation; needs strong security or it becomes a weak point

• Kerberos-based client authentication
• Znode ACLs enforce SASL-authenticated access for sensitive data (a sketch follows)
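A minimal sketch of a SASL-restricted znode, using the stock ZooKeeper client API; the path and principal name are illustrative.

```java
// Sketch: create a znode that only the SASL-authenticated "hbase" principal
// may read or modify, so secrets are not world-readable in ZooKeeper.
import java.util.Collections;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.ACL;
import org.apache.zookeeper.data.Id;

public class SecureZnode {
  public static void createSecret(ZooKeeper zk, byte[] data) throws Exception {
    ACL onlyHBase = new ACL(ZooDefs.Perms.ALL, new Id("sasl", "hbase"));
    // Illustrative path; HBase manages its own sensitive znodes internally
    zk.create("/demo/secret-keys", data,
        Collections.singletonList(onlyHBase), CreateMode.PERSISTENT);
  }
}
```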

Audit

• Simple audit log via Log4J
• Still need to work out a structured format for audit log messages (a possible shape is sketched below)
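One possible shape for such messages, as a hedged sketch with plain Log4J 1.2; the logger name and key=value layout are hypothetical, not an HBase-defined format.

```java
// Sketch: key=value audit records are easier to parse later than free text.
import org.apache.log4j.Logger;

public class AuditLog {
  private static final Logger AUDIT = Logger.getLogger("SecurityLogger"); // hypothetical name

  public static void access(String user, String action, String table,
                            String family, boolean allowed) {
    AUDIT.info(String.format("user=%s action=%s table=%s family=%s result=%s",
        user, action, table, family, allowed ? "allowed" : "denied"));
  }
}
```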

Two Implementation “Levels”

1. Secure RPC
• SecureRPCEngine for integration with Secure Hadoop, strong user authentication, message integrity, and encryption on the wire
• Implementation is solid

2. Coprocessor-based add-ons (see the loading sketch after this list)
• TokenProvider: Install only if running MR jobs with HBase RPC security enabled
• AccessController: Install on a per-table basis, configure per-CF policy, otherwise no overheads
• Implementations bring in new runtime dependencies on ZooKeeper, still considered experimental
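A sketch of loading both coprocessors cluster-wide via configuration; the class names are those shipped with the security build, and the properties normally go in hbase-site.xml (the AccessController can instead be attached per table, as the slide notes).

```java
// Sketch: register the security coprocessors so they load in every region.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class CoprocessorConf {
  public static Configuration withSecurityCoprocessors() {
    Configuration conf = HBaseConfiguration.create();
    // Region-side hooks: token issuance plus per-operation access checks
    conf.set("hbase.coprocessor.region.classes",
        "org.apache.hadoop.hbase.security.token.TokenProvider,"
        + "org.apache.hadoop.hbase.security.access.AccessController");
    // Master-side hooks guard schema operations (create/alter/drop)
    conf.set("hbase.coprocessor.master.classes",
        "org.apache.hadoop.hbase.security.access.AccessController");
    return conf;
  }
}
```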


Layering

[Diagram: layering of the security implementation.]

• Foundation: OS and the authentication infrastructure (Kerberos + LDAP)
• HDFS and MapReduce sit on Hadoop Secure RPC
• HBase sits on HBase Secure RPC, with the TokenProvider coprocessor and the AccessController coprocessor (optional on a per-table basis)
• Clients: the HBase Java client and HBase MapReduce client speak secure RPC directly; Thrift and REST clients go through the HBase Thrift and REST gateways

Experience

Secure RPC Engine

• Authentication adds latency at connection setup: Extra round trips for SASL negotiation

• Recommendation: Increase RPC idle time for better connection reuse (see the sketch below)

• Negotiating message integrity (“auth-int”) takes ~5% off of max throughput

• Negotiating SASL encryption (“auth-conf”) takes ~10% off of max throughput

• Recommendation: Consider your need for such options carefully
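A hedged sketch of raising the client connection idle timeout; the property name here follows the Hadoop-derived RPC client and is an assumption to verify against your build.

```java
// Sketch: keep authenticated connections alive longer so bursts of requests
// reuse them instead of paying SASL negotiation again.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class IdleTimeConf {
  public static Configuration withLongerIdle() {
    Configuration conf = HBaseConfiguration.create();
    // Assumed property name (Hadoop-derived RPC client); default ~10 seconds
    conf.setInt("hbase.ipc.client.connection.maxidletime", 120000); // 2 minutes
    return conf;
  }
}
```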

Secure RPC Engine

• A Hadoop system including HBase will initiate RPC far more frequently than without (file reads, compactions, client API access, …)

• If the KDC is overloaded then not only client operations but also things like region post-deployment tasks may fail, increasing region transition time

• Recommendation: HA KDC deployment, KDC capacity planning, trust federation over multiple KDC HA-pairs

Secure RPC Engine

• Activity swarms may be seen by a KDC as replay attacks (“Request is a replay (34)”)

• Recommendation: Ensure unique keys for each service instance, e.g. hbase/host@realm where host is the FQDN
• Recommendation: Check for clock skew over cluster hosts
• Recommendation: Use MIT Kerberos 1.8
• Recommendation: Increase RPC idle time for better connection reuse
• Recommendation: Avoid too-frequent HBCK validation of cluster health

Hadoop Security Issues (?)

• Open issue: Occasional swarms, lasting 5-10 seconds, at intervals of roughly the TGT lifetime, of:

date time host.dc ERROR [PostOpenDeployTasks: a74847b544ba37001f56a9d716385253] (org.apache.hadoop.security.UserGroupInformation) - PriviledgedActionException as:hbase/host.dc@realm (auth:KERBEROS) cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]

• Some Hadoop RPC improvements not yet ported
• Speaking of swarms, at or about the delegation token expiration interval you may see runs of:

date time host.dc ERROR [DataStreamer for file file block blockId] (org.apache.hadoop.security.UserGroupInformation) - PriviledgedActionException as:blockId (auth:SIMPLE) cause:org.apache.hadoop.ipc.RemoteException: Block token with block_token_identifier (expiryDate=timestamp, keyId=keyId, userId=hbase, blockIds=blockId, access modes=[READ|WRITE]) is expired.

These should probably not be logged at ERROR level

TokenProvider

• Increases exposure to ZooKeeper related RegionServer aborts: If keys cannot be rolled or accessed due to a ZK error, we must fail closed

• Recommendation: Provision sufficient ZK quorum peers and deploy them in separate failure domains (one at each top of rack, or similar)

• Recommendation: Redundant L2 / L2+L3, you probably have it already

• Recent versions of ZooKeeper have important bug fixes
• Recommendation: Use ZooKeeper 3.4.4 (when released) or higher

For more detail on HBase token authentication: http://wiki.apache.org/hadoop/Hbase/HBaseTokenAuthentication

AccessController

• Use 0.92.1 or above for a bug fix with Get protection

• The AccessController will create a small new “system” table named _acl_; the data in this table is almost as important as that in .META.

• Recommendation: Use the shell to manually flush the ACL table after permissions changes to ensure changes are persisted (a client-side equivalent is sketched below)

• Recommendation: The recommendations related to ZooKeeper for TokenProvider apply equally here
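For scripted setups, the same flush can be issued from client code; a minimal sketch, assuming the 0.92-era HBaseAdmin API.

```java
// Sketch: force the _acl_ table's memstore to disk after permission changes.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class FlushAcl {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);
    admin.flush("_acl_"); // asynchronous; persists recent ACL edits
    admin.close();
  }
}
```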

Shell Support

• Shell support is rudimentary but covers the basic use cases

• Note: You must supply exactly the same permission specification to revoke as you did to grant; there is no wildcarding and nothing like revoke all

Demonstration Video

Thank You!