Play With Streams [email protected] Jun, 2014

Upload: tianjian-chen

Post on 14-Jul-2015

About Me

• Principal Architect @Baidu.com

• Contract Programmer

– C/C++/Python

• Post Engineering Disorder Therapist

Post Engineering Disorder

Murphy’s Law

• Anything that Can Go Wrong Will Go Wrong

– Unstable Servers

– Unstable Network

– Unstable Data Source

– Unstable Managers

– Unstable…

Agenda

• Section I: A Scratch

– Brief Intro to Streams

• Section II: Build From Ground Up

– Modern Stream Processing Architecture

• Section III: Could Be Much Sexier

– Stream Evolution in Progress

Section I: A Scratch

Brief Intro to Streams

Highlights

• Natural Beauty of Streams

• Stream Processing Design Language (SDL)

• Scratching Stream Applications

Streams are Everywhere

Definition of Stream

• A Series Of Data Packs

• Data is Structured or Semi-Structured

• With Internal Topologies

– DAG in most cases

Traffic Cam Network

Wireless Sensor Network

Recommender System

Stream Description Language

• Normalize

• Accumulate

• Join

• Partition

• Filtering

Normalize

[Diagram: Message Type A and Message Type B enter NORM; a Standard Message comes out]
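A minimal sketch of a NORM operator, assuming two hypothetical input layouts (the field names `uid`, `ts_ms`, `user_name`, and `time` are invented for illustration and are not from the slides). It only shows the idea of mapping several message types onto one standard message.

```python
from typing import Any, Dict


def normalize(message: Dict[str, Any]) -> Dict[str, Any]:
    """Map a type-A or type-B message onto one standard layout."""
    if message["type"] == "A":
        # Hypothetical type A: integer user id, timestamp in milliseconds.
        return {"user": str(message["uid"]), "timestamp": message["ts_ms"] / 1000.0}
    if message["type"] == "B":
        # Hypothetical type B: user name, timestamp in seconds.
        return {"user": message["user_name"], "timestamp": float(message["time"])}
    raise ValueError(f"unknown message type: {message['type']!r}")


incoming = [
    {"type": "A", "uid": 7, "ts_ms": 1500},
    {"type": "B", "user_name": "alice", "time": 2},
]
standard_messages = [normalize(m) for m in incoming]
```

Downstream operators then only need to understand the standard layout.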

Accumulate

[Diagram: Messages enter ACCU, are held in a BUFFER, and leave as Accumulated Buffers]

Common Accumulator Types

• Time Window

• Volume

• Condition Trigger
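The three accumulator types above can be sketched as Python generators. This is an illustrative sketch, not an implementation from the talk; the function names and the `(timestamp, payload)` message shape are assumptions.

```python
from typing import Any, Callable, Iterable, Iterator, List, Tuple


def accumulate_by_volume(messages: Iterable[Any], size: int) -> Iterator[List[Any]]:
    """Volume: emit a buffer every `size` messages."""
    buffer: List[Any] = []
    for message in messages:
        buffer.append(message)
        if len(buffer) >= size:
            yield buffer
            buffer = []
    if buffer:  # flush the partial buffer when the input ends
        yield buffer


def accumulate_by_time_window(
    messages: Iterable[Tuple[float, Any]], window: float
) -> Iterator[List[Any]]:
    """Time window: emit the buffer once `window` seconds have passed.

    Assumes (timestamp, payload) pairs arriving in timestamp order.
    """
    buffer: List[Any] = []
    window_start = None
    for timestamp, payload in messages:
        if window_start is None:
            window_start = timestamp
        elif timestamp - window_start >= window:
            yield buffer
            buffer = []
            window_start = timestamp
        buffer.append(payload)
    if buffer:
        yield buffer


def accumulate_until(
    messages: Iterable[Any], trigger: Callable[[Any], bool]
) -> Iterator[List[Any]]:
    """Condition trigger: emit the buffer whenever `trigger(message)` is true."""
    buffer: List[Any] = []
    for message in messages:
        buffer.append(message)
        if trigger(message):
            yield buffer
            buffer = []
    if buffer:
        yield buffer
```

For example, `list(accumulate_by_volume(range(5), 2))` yields `[[0, 1], [2, 3], [4]]`.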

Join

[Diagram: a Message with a key and Indexed Info. enter JOIN; a Joined Message comes out]

Common Join Rules

• Internal Mode

– Indexing Only Input Messages

– Limited Time Window

• External Mode

– Indexing External Data

– Extremely Useful for Integrating Asynchronous Components
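A rough sketch of the two join modes. The message shapes, the `left`/`right` stream labels, and the choice to drop unmatched messages are assumptions made for illustration only.

```python
from typing import Any, Dict, Iterable, Iterator, Tuple


def join_external(
    messages: Iterable[Dict[str, Any]], index: Dict[str, Dict[str, Any]]
) -> Iterator[Dict[str, Any]]:
    """External mode: enrich each message from an external index by key.

    `index` stands in for any external lookup (cache, database, ...).
    Messages without a match are dropped in this sketch.
    """
    for message in messages:
        info = index.get(message["key"])
        if info is not None:
            yield {**message, **info}


def join_internal(
    messages: Iterable[Tuple[float, str, str, Any]], window: float
) -> Iterator[Tuple[str, Any, Any]]:
    """Internal mode: pair 'left' and 'right' input messages sharing a key.

    Only the input messages themselves are indexed, and a pending message
    can only be matched within `window` seconds (the limited time window).
    Assumes (timestamp, side, key, value) tuples in timestamp order.
    """
    pending: Dict[Tuple[str, str], Tuple[float, Any]] = {}
    for timestamp, side, key, value in messages:
        other = "right" if side == "left" else "left"
        match = pending.pop((other, key), None)
        if match is not None and timestamp - match[0] <= window:
            left, right = (value, match[1]) if side == "left" else (match[1], value)
            yield key, left, right
        else:
            # No partner yet, or the partner expired and was discarded by pop().
            pending[(side, key)] = (timestamp, value)
```

A production join would also evict expired entries on a timer rather than only when a matching key arrives.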

Partition

[Diagram: a Message with a key enters PART]

Common Partition Rules

• Feature Based – Application Related

• Random – Balancing Load

• Hash – Aggregating Data by User-Defined Key

• Replication – For Improving Availability
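The hash, random, and replication rules can be sketched in a few lines. The function names are hypothetical. `zlib.crc32` is used instead of Python's built-in `hash()` because string hashes from `hash()` vary between processes, which would break key-based aggregation across workers.

```python
import random
import zlib
from typing import Any, Dict, List


def hash_partition(key: str, partitions: int) -> int:
    """Hash rule: the same key always maps to the same sub-stream."""
    return zlib.crc32(key.encode("utf-8")) % partitions


def random_partition(partitions: int, rng: random.Random) -> int:
    """Random rule: spread load without regard to content."""
    return rng.randrange(partitions)


def replicate(partitions: int) -> List[int]:
    """Replication rule: send the message to every sub-stream."""
    return list(range(partitions))


def route_by_hash(messages: List[Dict[str, Any]], partitions: int) -> List[List[Dict[str, Any]]]:
    """Split a batch of keyed messages into `partitions` sub-streams."""
    sub_streams: List[List[Dict[str, Any]]] = [[] for _ in range(partitions)]
    for message in messages:
        sub_streams[hash_partition(message["key"], partitions)].append(message)
    return sub_streams
```

A feature-based rule would replace `hash_partition` with application logic, for example routing by message type.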

Filtering

[Diagram: Input Messages enter FILT, which either Drops them or lets them Pass Through]

Pit Stop

• Normalize

• Accumulate

• Join

• Partition

• Filtering

Distinct Count

A Simple Hello World: Word Count

[Diagram: Text Parser feeding ACCU]

An Improved Version

[Diagram: Text Parser, PART, three parallel ACCU operators, and NORM]
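A single-process sketch of the improved word count. It assumes the flow is parser, then hash partition, then one accumulator per partition, with the final NORM step merging the partial counts into one result; that reading of the diagram, the tokenizing regex, and the function names are all assumptions.

```python
import re
import zlib
from collections import Counter
from typing import Iterable, List


def parse(text: str) -> List[str]:
    """Text parser: split a line into lowercase words."""
    return re.findall(r"[a-z']+", text.lower())


def word_count(lines: Iterable[str], partitions: int = 3) -> Counter:
    # One accumulator (ACCU) per partition.
    accumulators = [Counter() for _ in range(partitions)]
    for line in lines:
        for word in parse(line):
            # PART: hash partitioning sends each word to exactly one accumulator.
            target = zlib.crc32(word.encode("utf-8")) % partitions
            accumulators[target][word] += 1
    # Assumed NORM step: merge the partial counts into one standard result.
    merged: Counter = Counter()
    for partial in accumulators:
        merged.update(partial)
    return merged


counts = word_count(["a b a", "b c"])
```

Because each word lives in exactly one accumulator, the accumulators could run on separate machines and the merge is a plain union.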

Sum Up!

• Streams are everywhere

• We have a modeling tool to describe stream processing flows (SDL)

• Proceed to build real systems

See U In Next Hour

Section II: Build From Ground Up

Modern Stream Processing Architecture

Highlights

• Flow Means Non-Stopping

• Reliability Matters

• Workload Fluctuation Handling

Message Relay System Basics

• Message Queuing

• Message Passing

• Hybrid Queuing

With Message Queuing

[Diagram: Operator 1 and Operator 2 connected through Queue A, Queue B, and Queue C]

Message Passing

[Diagram: Operator 1, Operator 2, and Operator 3 passing messages to one another without queues]

Hybrid Queuing

[Diagram: Operator 1, Operator 2, and Operator 3, with Queue A and Queue B on some of the links]

Pit Stop

• Message relay is the foundation of stream processing

• 3 basic methods of message relay
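A toy, in-process illustration of the first two relay methods, using Python's standard `queue` and `threading` modules. The two operators (add one, then double) are placeholders; a real system would relay between processes or machines.

```python
import queue
import threading
from typing import List


def operator_1(item: int) -> int:
    return item + 1


def operator_2(item: int) -> int:
    return item * 2


def run_with_queue(items: List[int]) -> List[int]:
    """Message queuing: operators are decoupled by a queue between them."""
    relay: "queue.Queue" = queue.Queue()
    results: List[int] = []

    def consume() -> None:
        while True:
            item = relay.get()
            if item is None:  # sentinel: upstream is finished
                break
            results.append(operator_2(item))

    worker = threading.Thread(target=consume)
    worker.start()
    for item in items:
        relay.put(operator_1(item))
    relay.put(None)
    worker.join()
    return results


def run_with_passing(items: List[int]) -> List[int]:
    """Message passing: operator 1 hands each message directly to operator 2."""
    return [operator_2(operator_1(item)) for item in items]
```

Both functions produce the same output; the queue version lets the operators run at different speeds, while the passing version has lower latency. A hybrid setup would use a queue on some links and direct passing on others.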

Why Does Reliability Matter?

• We have mission-critical applications

– Real-time stock exchange analysis

– Ad network click monitoring & billing

Reliability Solutions

• Upstream Backup & Replay

• Source Backup & Replay

• Processing Status Backup & Replay

Upstream Backup & Replay

[Diagram: Upstream and Downstream operator pairs exchanging ACKs]
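One way to sketch upstream backup: the upstream operator keeps every message it has sent until the downstream operator acknowledges it, and replays whatever is still unacknowledged after a failure. The class and method names are hypothetical.

```python
from collections import OrderedDict
from typing import Any, List, Tuple


class UpstreamBuffer:
    """Keeps sent messages until the downstream operator ACKs them."""

    def __init__(self) -> None:
        self._next_seq = 0
        self._unacked: "OrderedDict[int, Any]" = OrderedDict()

    def send(self, payload: Any) -> Tuple[int, Any]:
        """Record the message as in flight and return (sequence number, payload)."""
        seq = self._next_seq
        self._next_seq += 1
        self._unacked[seq] = payload
        return seq, payload

    def ack(self, seq: int) -> None:
        """Downstream confirmed processing; the backup copy can be dropped."""
        self._unacked.pop(seq, None)

    def replay(self) -> List[Tuple[int, Any]]:
        """After a downstream failure, resend everything not yet ACKed, in order."""
        return list(self._unacked.items())


buffer = UpstreamBuffer()
for payload in ("a", "b", "c"):
    buffer.send(payload)
buffer.ack(0)
to_resend = buffer.replay()  # [(1, "b"), (2, "c")]
```

Replay can deliver a message twice if an ACK is lost, so the downstream side would need to deduplicate by sequence number.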

Source Backup & Replay

[Diagram: chained Upstream and Downstream operators with ACK messages]

Processing Status Backup & Replay

[Diagram: a Stream Operator between Upstream and Downstream, and a Stream Operator Shadow that receives Status Synchronize Messages; Redirect arrows on both the upstream and downstream sides point to the shadow]

While Reliability Hurts Performance

• All Reliability Solutions are Based on

– Indexing

– Snapshot

– Replay

• A Club Of Performance Penalties

Tuning Strategies

• Stateful Operators vs. Stateless Operators

– Independent State Storage

– White Board Programming Model

– Lazy State Synchronization

• Micro Batching Snapshot

Micro Batching

Batch Size   Time Window   Throughput   Snapshot Cost   Restore Cost
1            1ms           1x           very high       very low
10           10ms          10x          high            low
100          100ms         100x         medium          low
1000         100ms         1000x        medium          low
10000        1s            5000x        low             high

[Slide annotations on the table: "Most Systems Are Here"; "May Be Constrained By Network Configuration"]
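The trade-off in the table can be illustrated with a small simulation: an operator sums its input, takes a snapshot every `batch_size` messages, and after one simulated failure restores the last snapshot and replays from there. The function and its parameters are invented for this sketch and do not model real costs.

```python
from typing import List, Optional, Tuple


def run_with_snapshots(
    messages: List[int], batch_size: int, fail_at: Optional[int] = None
) -> Tuple[int, int, int]:
    """Return (final state, snapshots taken, messages replayed after the failure)."""
    state = 0
    snapshot_state, snapshot_pos = 0, 0
    snapshots = 0
    replayed = 0
    failed = False
    pos = 0
    while pos < len(messages):
        if fail_at is not None and pos == fail_at and not failed:
            # Simulated crash: roll back to the last snapshot and replay.
            failed = True
            replayed = pos - snapshot_pos
            state, pos = snapshot_state, snapshot_pos
            continue
        state += messages[pos]
        pos += 1
        if pos % batch_size == 0:
            snapshot_state, snapshot_pos = state, pos
            snapshots += 1
    return state, snapshots, replayed


data = list(range(100))
per_message = run_with_snapshots(data, batch_size=1, fail_at=57)    # (4950, 100, 0)
micro_batch = run_with_snapshots(data, batch_size=10, fail_at=57)   # (4950, 10, 7)
```

Both runs reach the same final state. Snapshotting every message pays for 100 snapshots and replays nothing, while a batch of 10 pays for 10 snapshots and replays 7 messages.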

Pit Stop

• Reliability is based on

– Message Backup & Replay

– Status Snapshot

• When tuning, think of

– How to handle operator status

– Micro Batching

Workload Fluctuation

How Does Fluctuation Happen?

• Data Source Fluctuation

• Fault Tolerance Operations

Fluctuation Handling

• Technologies To Obtain

– High Performance RPC Framework

– Auto Partitioning

– Dynamic Resource Allocation

– Global Flow Control

High Performance RPC Framework

• Indication of High Performance

– Over 20k QPS, with 1-byte payload

– On a commodity server with two 6-core CPUs and Gigabit Ethernet

• See Also

– SOFA Framework from Baidu.com

– https://github.com/BaiduPS/sofa-pbrpc

Auto Partitioning

• Adjust the Number of Sub-Streams

• Tricks:

– Consistent Hashing

– Exponential Allocation
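A compact sketch of consistent hashing for mapping keys to sub-streams, so that changing the number of sub-streams moves only a fraction of the keys. The class name, the use of MD5, and the virtual-node count are illustrative assumptions; the slides do not describe a specific implementation.

```python
import bisect
import hashlib
from typing import Iterable, List, Tuple


def _hash(value: str) -> int:
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class ConsistentHashRing:
    def __init__(self, nodes: Iterable[str], replicas: int = 64) -> None:
        self._replicas = replicas  # virtual nodes per sub-stream
        self._ring: List[Tuple[int, str]] = []  # sorted (hash, node) points
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        """Insert the node's virtual points into the sorted ring."""
        for i in range(self._replicas):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def lookup(self, key: str) -> str:
        """Return the node owning the first ring point after the key's hash."""
        index = bisect.bisect_right(self._ring, (_hash(key), "")) % len(self._ring)
        return self._ring[index][1]


ring = ConsistentHashRing(["sub-0", "sub-1", "sub-2"])
keys = [f"user-{i}" for i in range(1000)]
before = {key: ring.lookup(key) for key in keys}
ring.add("sub-3")
after = {key: ring.lookup(key) for key in keys}
```

After adding `sub-3`, every key either stays on its previous sub-stream or moves to the new one; no key moves between the existing sub-streams.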

Dynamic Resource Allocation

• Tree Model

• Evaluation Function

• Network Constraints

[Diagram: resource tree with data centers IDC A and IDC B, physical hosts PHY1 to PHY4, and virtual machines VM 1 to VM 3]
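A speculative sketch of how a tree model and an evaluation function might combine. The VM-to-host layout, the cost values, and the one-operator-per-VM rule are all assumptions for illustration; the slides give only the three bullet points and the diagram labels.

```python
from itertools import permutations
from typing import Dict, List, Tuple

# Hypothetical tree: VM -> (physical host, data center).
LOCATION: Dict[str, Tuple[str, str]] = {
    "VM 1": ("PHY1", "IDC A"),
    "VM 2": ("PHY1", "IDC A"),
    "VM 3": ("PHY3", "IDC B"),
}


def network_cost(vm_a: str, vm_b: str) -> int:
    """Cost grows with the distance between two VMs in the tree (assumed values)."""
    if vm_a == vm_b:
        return 0
    host_a, idc_a = LOCATION[vm_a]
    host_b, idc_b = LOCATION[vm_b]
    if host_a == host_b:
        return 1   # same physical host
    if idc_a == idc_b:
        return 2   # same data center
    return 10      # cross data center link: the network constraint


def evaluate(placement: Dict[str, str], edges: List[Tuple[str, str]]) -> int:
    """Evaluation function: total network cost over all operator-to-operator edges."""
    return sum(network_cost(placement[src], placement[dst]) for src, dst in edges)


def best_placement(operators: List[str], edges: List[Tuple[str, str]]) -> Dict[str, str]:
    """Brute-force search, one operator per VM; fine for a toy tree only."""
    candidates = (
        dict(zip(operators, vms)) for vms in permutations(LOCATION, len(operators))
    )
    return min(candidates, key=lambda placement: evaluate(placement, edges))


edges = [("Op1", "Op2")]
placement = best_placement(["Op1", "Op2"], edges)
```

With this layout the two connected operators land on the two VMs that share a physical host. A real scheduler would need a search that scales beyond brute force.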

Global Flow Control

[Diagram: Op1, Op2, Op3, and Op4 connected to a Flow Control Service]
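A very small sketch of what a central flow control service could do: operators report their backlog and capacity, and the service throttles the source to the slowest congested operator. The policy and every name here are assumptions; the slides show only the diagram.

```python
from typing import Dict, Tuple


class FlowControlService:
    """Collects per-operator reports and derives one global source rate."""

    def __init__(self, max_source_rate: float) -> None:
        self._max_source_rate = max_source_rate
        # operator name -> (backlog in messages, capacity in messages/second)
        self._reports: Dict[str, Tuple[int, float]] = {}

    def report(self, operator: str, backlog: int, capacity: float) -> None:
        """Store the latest status report from one operator."""
        self._reports[operator] = (backlog, capacity)

    def source_rate(self) -> float:
        """Limit the source to the slowest operator that currently has a backlog."""
        congested = [cap for backlog, cap in self._reports.values() if backlog > 0]
        return min([self._max_source_rate] + congested)


service = FlowControlService(max_source_rate=1000.0)
service.report("Op1", backlog=0, capacity=800.0)
service.report("Op2", backlog=0, capacity=500.0)
rate_when_idle = service.source_rate()        # 1000.0
service.report("Op3", backlog=120, capacity=300.0)
rate_when_congested = service.source_rate()   # 300.0
```

Making the decision globally, rather than hop by hop, lets the source slow down before queues fill up along the whole pipeline.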

Pit Stop

• Workload Fluctuation Handling

– Very Fast RPC Framework

– Auto Partitioning

– Dynamic Resource Allocation

– Global Flow Control

Sum Up!

• Basic Architecture of Stream Processing System

• Reliability of Stream Processing

• Ways to Handle Workload Fluctuations

See U in the Afternoon

Section III: Could Be Much Sexier

Stream Evolution in Progress

Highlights

• Challenges in Real World Applications

• High Level Stream Programming

• Optimization Inside Hardware

Applications

• Stream Web Crawler

• Co-Serving with Hadoop M/R

Stream Web Crawling

[Diagram: stream web crawling pipeline built from operators OP1 to OP10. Operator labels: Logging API, Log Filter, Data Join, Feature Extraction, User-Model Updater, Crawling Job Updater, Crawling Job Generator, Crawling Bot, Image Crawling Job Generator, Cache Synchronizer. Stores and clusters: Redis Cluster (Web Page Cache), HBase Cluster (Web DataBase), Image Crawling Cluster, StatusDB]

Co-Serving

• Distributed RPC (DRPC)

• Dynamic Routing

[Diagram: Online Web Services (Query Preprocess, Query Transform, Result Merge, Intent Extraction) and an Online Query Log alongside stream operators OP1 to OP6, including User Intent Mining; on the Hadoop side, a Mapper initiates a DRPC and a Mapper receives the DRPC result]

Pit Stop

• Programming Complexity

• Network Management Complexity

• Scale-out Difficulty

High Level Programming

• Stream DataBase & StreamSQL

• Stream Computing Description Language

• Stream Programming Framework

StreamSQL

Stream DML Stack (Naiad)

Stream Framework

Pit Stop

Programming Interface   Complexity   Flexibility
Stream SQL              Low          Low
Stream DML              Medium       High
Stream Framework        High         Very High

Network Management Challenges

• Things Are App. Dependent

– QoS Request

– Relay Priority

– Security Strategy

– Resource Allocation

Network Management Solution

[Diagram: SDN Integration linking Stream App Flow Control and Network Configuration]

SDN Integration

[Diagram: Stream Flow Control Service, Monitors on Streams, and OpenFlow-Supported Network Devices]

Pit Stop

• Network Management is Crucial

– Stream Processing is Bandwidth Consuming

– Stream Processing may be Latency Sensitive

• Solution is Simple

– Software-Defined Networking (SDN) Integration

Scale-Out Difficulty

• Application Constraints

• Algorithm Constraints

• Network Hardware Constraints

So Scale Up!

• Co-Process Device

• Re-Programmable Co-Processors

FPGA Computing Card

• Cheaper than GPGPU

• More Flexible than GPGPU

• Higher POE than GPGPU

New Re-Configurable Xeon CPU

See Also

• A Reconfigurable Fabric for Accelerating Large-Scale Datacenter Services

– 20x Performance

– ISCA 2014 by Microsoft Research

– http://research.microsoft.com/pubs/212001/Catapult_ISCA_2014.pdf

Pit Stop

• Scale-out is difficult, think of scale-up

• Reconfigurable CPUs have achieved significant performance improvements

Conclusion

• Stream Processing System can be Well Modeled by SDL

• Trade Off between Reliability & Performance

• High level programming & Scale-Up are Future Trends

References

• Stonebraker, Michael, Uğur Çetintemel, and Stan Zdonik. "The 8 requirements of real-time stream processing." ACM SIGMOD Record 34, no. 4 (2005): 42-47.

• Zaharia, Matei, Tathagata Das, Haoyuan Li, Timothy Hunter, Scott Shenker, and Ion Stoica. "Discretized streams: Fault-tolerant streaming computation at scale." In Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles, pp. 423-438. ACM, 2013.

• Murray, Derek G., Frank McSherry, Rebecca Isaacs, Michael Isard, Paul Barham, and Martin Abadi. "Naiad: a timely dataflow system." In Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles, pp. 439-455. ACM, 2013.

• Castro Fernandez, Raul, Matteo Migliavacca, Evangelia Kalyvianaki, and Peter Pietzuch. "Integrating scale out and fault tolerance in stream processing using operator state management." In Proceedings of the 2013 international conference on Management of data, pp. 725-736. ACM, 2013.

Thank U!

• G+: [email protected]

• Skype: tianjian_chen

• Linkedin: http://lnkd.in/bRN6xsh