jstorm introduction-0.9.6

Company LOGO An Introduction of JStorm LongdaFeng([email protected])

Upload: longda-feng

Post on 06-Jul-2015

Page 1: Jstorm introduction-0.9.6

An Introduction of JStorm

Longda Feng ([email protected])

Page 2: Jstorm introduction-0.9.6

Longda Feng

Alibaba

Agenda

Background

Basic Concept & Scenarios

Why start JStorm?

JStorm vs Storm

Question and Answer

Page 3: Jstorm introduction-0.9.6

Who are we?

The JStorm team was among the earliest users of Storm in China.

Storm 0.5.1/0.5.4/0.6.0/0.6.2/0.7.0/0.7.1

JStorm 0.7.1/0.9.0/0.9.1/0.9.2/0.9.3/…

Our Duties:

Application Development

JStorm System Development

JStorm System Operation

Page 4: Jstorm introduction-0.9.6

Who is Using JStorm?

Many small Chinese companies are using JStorm

Page 5: Jstorm introduction-0.9.6

How Big?

More than 3000 servers

More than 3 trillion messages per day

Page 6: Jstorm introduction-0.9.6

What is JStorm?

JStorm is a distributed programming framework

Similar to Hadoop MapReduce but designed for real-time/in-memory scenarios

Users can build powerful distributed applications from very simple APIs

Page 7: Jstorm introduction-0.9.6

What is JStorm?

A redesign of Storm in Java.

Proven stable in very large clusters.

Much faster

Much more powerful

Page 8: Jstorm introduction-0.9.6

Basic Concept

Pipelined data processing

Page 9: Jstorm introduction-0.9.6

Advantage 1

Easy to learn:

Simple building blocks: Topology/Spout/Bolt APIs

Out-of-the-box RPC/Fault-tolerance/Real-time Data Grouping & Combining
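
The building blocks named on this slide can be pictured with a minimal plain-Java sketch. It does not use the JStorm API: the nested `Spout` and `Bolt` interfaces and the `runTopology` and `replay` helpers are hypothetical stand-ins that only mirror the idea of a spout emitting tuples and bolts transforming them one stage after another.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class PipelineSketch {
    // Hypothetical stand-ins for illustration; NOT the real JStorm/Storm interfaces.
    interface Spout { String nextTuple(); }                 // returns null once the source is drained
    interface Bolt { List<String> execute(String tuple); }  // may emit zero or more tuples

    // Pulls tuples from the spout and pushes each one through the bolts in order.
    static List<String> runTopology(Spout spout, List<Bolt> bolts) {
        List<String> out = new ArrayList<>();
        for (String t = spout.nextTuple(); t != null; t = spout.nextTuple()) {
            List<String> stage = new ArrayList<>();
            stage.add(t);
            for (Bolt bolt : bolts) {
                List<String> next = new ArrayList<>();
                for (String s : stage) {
                    next.addAll(bolt.execute(s));
                }
                stage = next;
            }
            out.addAll(stage);
        }
        return out;
    }

    // A toy spout that replays a fixed list of sentences.
    static Spout replay(String... lines) {
        Iterator<String> it = Arrays.asList(lines).iterator();
        return () -> it.hasNext() ? it.next() : null;
    }

    public static void main(String[] args) {
        Bolt split = sentence -> Arrays.asList(sentence.split(" "));
        Bolt upper = word -> Arrays.asList(word.toUpperCase());
        System.out.println(runTopology(replay("a b", "c"), Arrays.asList(split, upper)));
    }
}
```

In a real topology the stages run as separate tasks on different workers; this single-threaded loop only shows the data flow.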

Page 10: Jstorm introduction-0.9.6

Advantage 2

Excellent Scalability

Horizontally Scalable

DAG-based

Adjustable parallelism of each component
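
As a rough illustration of what adjusting a component's parallelism means, the sketch below spreads keyed tuples over a configurable number of tasks by hashing the key, in the spirit of a fields grouping. The class and method names are hypothetical and the hashing rule is an assumption for the example, not JStorm's actual routing code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ParallelismSketch {
    // Picks the task index for a key. The same key always lands on the
    // same task as long as the parallelism stays the same.
    static int taskFor(String key, int parallelism) {
        return Math.floorMod(key.hashCode(), parallelism);
    }

    // Distributes the keys over `parallelism` task buckets.
    static List<List<String>> distribute(List<String> keys, int parallelism) {
        List<List<String>> tasks = new ArrayList<>();
        for (int i = 0; i < parallelism; i++) {
            tasks.add(new ArrayList<>());
        }
        for (String key : keys) {
            tasks.get(taskFor(key, parallelism)).add(key);
        }
        return tasks;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("user1", "user2", "user3", "user1");
        System.out.println(distribute(keys, 2)); // two tasks share the load
        System.out.println(distribute(keys, 4)); // four tasks share the same load
    }
}
```

Raising the parallelism adds buckets without touching the component logic, which is the property the slide refers to.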

Page 11: Jstorm introduction-0.9.6

Stable

Guarantees Fault-Tolerance

No Single Point of Failure

• Nimbus HA

• Any Supervisor can be shut down

A new worker is spawned automatically to replace a failed one

Page 12: Jstorm introduction-0.9.6

Accuracy

The acking framework guarantees no loss of data.

The transaction framework guarantees data accuracy.
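
Storm documents its acker as tracking each tuple tree with a single XOR checksum of tuple ids, and the sketch below assumes JStorm's acking framework follows the same idea. `AckerSketch` and its methods are hypothetical names for illustration, not the framework's classes: every id is XORed in once when the tuple is emitted and once when it is acked, so the checksum returns to zero exactly when every emitted tuple has been acked.

```java
import java.util.HashMap;
import java.util.Map;

class AckerSketch {
    // rootId -> XOR of every tuple id emitted or acked so far in that tuple tree.
    private final Map<Long, Long> pending = new HashMap<>();

    // Record that a tuple with id `tupleId` was emitted in the tree rooted at `rootId`.
    void emitted(long rootId, long tupleId) {
        xor(rootId, tupleId);
    }

    // Record an ack; returns true once the whole tree has been processed.
    boolean acked(long rootId, long tupleId) {
        xor(rootId, tupleId);
        if (pending.get(rootId) == 0L) {
            pending.remove(rootId);
            return true;
        }
        return false;
    }

    private void xor(long rootId, long tupleId) {
        pending.merge(rootId, tupleId, (a, b) -> a ^ b);
    }

    public static void main(String[] args) {
        AckerSketch acker = new AckerSketch();
        acker.emitted(1L, 10L);
        acker.emitted(1L, 20L);
        System.out.println(acker.acked(1L, 10L)); // one tuple is still outstanding
        System.out.println(acker.acked(1L, 20L)); // tree complete
    }
}
```

The sketch assumes non-zero, effectively unique tuple ids; a spout tuple whose checksum never reaches zero before a timeout would be replayed.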

Page 13: Jstorm introduction-0.9.6

Scenarios

Stateless Computation

All data comes from tuples

Use Cases:

Log Analysis

Pipelined System

Message converter

Statistical Analysis

Real-time Recommendation Algorithm

Page 14: Jstorm introduction-0.9.6

Why start JStorm?

The Storm community is not as active as we expected

Tailored for enterprise environment

Fixed critical bugs in Storm

Provided professional technical support, improved app development pace.

Reduced operational cost.

Page 15: Jstorm introduction-0.9.6

How Many Versions?

https://github.com/alibaba/JStorm/releases

0.9.6 (2014/9/22)

0.9.5.1 (2014/9/14)

0.9.5 (2014/8/27)

0.9.4.1 (2014/8/15)

0.9.4 (2014/7/18)

0.9.3.1 (2014/5/31)

0.9.3 (2014/5/10)

0.9.2 (2014/4/8)

0.9.1 (2014/1/24)

0.9.0 (2013/12/30)

0.7.1 (2013/4/28)

Page 16: Jstorm introduction-0.9.6

JStorm is a superset of Storm

Programs that run on Storm can run on JStorm without code changes

Page 17: Jstorm introduction-0.9.6

More stable (1) -- nimbus HA

Nimbus HA

Dual-Nimbus HA

Page 18: Jstorm introduction-0.9.6

More stable (2) -- RPC

Netty supports 2 RPC modes

Async

Sync

• Sending speed keeps up with the receiving speed, therefore the data flow is more stable.
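
One way to picture a sender that is paced by its receiver is a bounded buffer whose `put` blocks while the buffer is full. The sketch below is an analogy in plain Java using `java.util.concurrent`, not JStorm's Netty transport; `SyncSendSketch` and `run` are hypothetical names.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class SyncSendSketch {
    // Sends `count` messages through a buffer of size `capacity` and
    // returns how many messages the receiver consumed.
    static int run(int count, int capacity) {
        BlockingQueue<Integer> channel = new ArrayBlockingQueue<>(capacity);
        int[] received = {0};
        Thread receiver = new Thread(() -> {
            try {
                for (int i = 0; i < count; i++) {
                    channel.take(); // blocks while the buffer is empty
                    received[0]++;
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        receiver.start();
        try {
            for (int i = 0; i < count; i++) {
                channel.put(i); // blocks while `capacity` messages are still unread
            }
            receiver.join(); // join makes the receiver's counter visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while sending", e);
        }
        return received[0];
    }

    public static void main(String[] args) {
        System.out.println(run(100, 2));
    }
}
```

Because the sender can never be more than `capacity` messages ahead, memory use stays flat even when the receiver is slow.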

Page 19: Jstorm introduction-0.9.6

More stable (3) -- resource isolation

A malicious worker cannot interfere with others

Supports CPU isolation with cgroups

Supports memory isolation

Resource quotas can be enforced on each group (before 0.9.5)

Page 20: Jstorm introduction-0.9.6

More stable (4) -- Monitor

Monitor every component in your topology

Many more metrics (70+) than Storm

Supports user-defined metrics

Supports user-defined alerts

Page 21: Jstorm introduction-0.9.6

More stable (5) -- CPU usage

Better CPU utilization

Improved disruptor implementation

• CPU usage drops from 300% to 10% when the processing queue is full

Avoids CPU spin-waiting

• Moves nextTuple/ack/fail work to a different thread

Page 22: Jstorm introduction-0.9.6

More stable (6) -- more catch

Added try-catch wherever failures can occur:

Nimbus/supervisor main thread

Spout/bolt initialization/cleanup

All I/O operations and serialization/deserialization

All ZK operations

Page 23: Jstorm introduction-0.9.6

More stable (7) -- ZK

Reduced unnecessary ZK usage:

Removed unnecessary watchers

Increased ZK heartbeat frequency

Detects failed workers without a full scan of the entire ZK directory

Page 24: Jstorm introduction-0.9.6

More stable (8) -- other

Improved GC tuning.

Guarantees that all workers are killed after the kill command is issued

Guarantees a single supervisor/nimbus per instance

Avoids excessive use of local ports by the Netty client

…

Page 25: Jstorm introduction-0.9.6

More powerful scheduler

Balances tasks with regard to:

CPU

Memory

Network

Page 26: Jstorm introduction-0.9.6

CPU assignment

By default, each worker is assigned a single CPU slot

Applications can be configured to use more slots

Why:

At Alimama, some tasks create extra threads for other work, so one CPU slot does not meet the requirement

Page 27: Jstorm introduction-0.9.6

Memory Usage

Default worker memory is 2 GB

Applications can be configured to use more memory slots

Why:

In the Alipay Mdrill application, the Solr bolt requests much more memory

Page 28: Jstorm introduction-0.9.6

Smarter Balancing

With JStorm Scheduler:

Tasks that exchange data heavily tend to be assigned to the same worker to avoid networking cost.
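
To make the networking-cost argument concrete, the sketch below counts how much traffic crosses worker boundaries under two placements. The traffic figures, the class name, and the cost function are made up for illustration and do not describe the JStorm scheduler's internals.

```java
class ColocationSketch {
    // traffic[i][j] = tuples per second sent from task i to task j (made-up numbers).
    // workerOf[i]   = index of the worker that task i is placed on.
    // Returns the traffic that has to cross worker boundaries.
    static long crossWorkerTraffic(long[][] traffic, int[] workerOf) {
        long cost = 0;
        for (int i = 0; i < traffic.length; i++) {
            for (int j = 0; j < traffic[i].length; j++) {
                if (workerOf[i] != workerOf[j]) {
                    cost += traffic[i][j];
                }
            }
        }
        return cost;
    }

    // Four tasks: 0 -> 1 and 2 -> 3 exchange data heavily, 1 -> 2 only lightly.
    static long[][] sampleTraffic() {
        long[][] traffic = new long[4][4];
        traffic[0][1] = 900;
        traffic[1][2] = 10;
        traffic[2][3] = 800;
        return traffic;
    }

    public static void main(String[] args) {
        long[][] traffic = sampleTraffic();
        int[] colocated = {0, 0, 1, 1}; // heavy pairs share a worker
        int[] scattered = {0, 1, 0, 1}; // heavy pairs are split apart
        System.out.println(crossWorkerTraffic(traffic, colocated)); // 10
        System.out.println(crossWorkerTraffic(traffic, scattered)); // 1710
    }
}
```

With the heavy pairs co-located, only the light 1 -> 2 link crosses the network.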

Page 29: Jstorm introduction-0.9.6

User Defined Scheduler

Users can pin a task to a designated worker

Users can set how many CPU slots/memory slots are used

Why:

In the Taobao TAE project, some bolts need to run on user-defined nodes

Page 30: Jstorm introduction-0.9.6

Task on Different Node

Tasks of one component can be scheduled to run on different nodes

Why:

In Alipay Mdrill, Solr bolt tasks must run on different nodes

Page 31: Jstorm introduction-0.9.6

Task on Single Node

All tasks can be scheduled to run on a single node.

Why:

In Taobao TLog, there are many small jobs; to reduce network cost, all tasks of one job must run on a single node.

Page 32: Jstorm introduction-0.9.6

Old Assignment

“Last Assignment Policy”

By default, a task runs on the machine it ran on previously

Why:

In Alibaba CDO, users wanted to reuse the old workers when restarting an application
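
The policy described on this slide could look roughly like the following sketch. `chooseHost`, the host names, and the fallback to the first live host are assumptions made for the example; the slide does not say what JStorm does when the previous machine is unavailable.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class LastAssignmentSketch {
    // Prefers the host the task ran on last time, as long as that host is still alive.
    // Assumes `aliveHosts` is non-empty; the fallback rule is hypothetical.
    static String chooseHost(String task, Map<String, String> previous, List<String> aliveHosts) {
        String last = previous.get(task);
        if (last != null && aliveHosts.contains(last)) {
            return last;
        }
        return aliveHosts.get(0);
    }

    public static void main(String[] args) {
        Map<String, String> previous = new HashMap<>();
        previous.put("task-1", "host-b");
        System.out.println(chooseHost("task-1", previous, Arrays.asList("host-a", "host-b")));
        System.out.println(chooseHost("task-1", previous, Arrays.asList("host-a", "host-c")));
    }
}
```

The first call reuses `host-b`; the second falls back to `host-a` because `host-b` is no longer in the live list.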

Page 33: Jstorm introduction-0.9.6

Pluggable

Able to run on:

Hadoop YARN (more stable than Storm)

Alibaba Apsara Cloud System

Alibaba Elastic Resource Pool

Page 34: Jstorm introduction-0.9.6

Classloader

Resolves JAR conflicts between applications and JStorm

Page 35: Jstorm introduction-0.9.6

More convenient UI

More useful stats collected and displayed.

Browse Worker Log in UI

Page 36: Jstorm introduction-0.9.6

Support libjar

No need to assemble all dependency JARs into one JAR

Submit library JARs with the libjar parameter

Support worker.classpath

Page 37: Jstorm introduction-0.9.6

Faster

6 Servers (24core/98G)

18 Spout/18 Bolt/18 Acker

[Figure: Throughput vs workers. x-axis: workers (0 to 60); y-axis: poll tuples/10s (0 to 12,000,000); series: jstorm, storm. Data labels as extracted, with series assignment not recoverable: 9280598, 10818815, 9065965, 6819139, 5610201, 6243680, 6830500, 5595900, 5474180, 3379800]

Page 38: Jstorm introduction-0.9.6

JStorm at 410,000/s (41W/s) sending speed

Page 39: Jstorm introduction-0.9.6

Storm at 410,000/s (41W/s) sending speed

Page 40: Jstorm introduction-0.9.6

Why Faster

Reduce memory-copying by zeroMq

Dedicated Deserializing Thread

Better Tuned Sampling Logic

Better Tuned Acking Framework

Better Tuned GC

Page 41: Jstorm introduction-0.9.6

Other Improvements

More than 100 improvements

https://github.com/alibaba/JStorm/blob/master/history.md

Fixed a race condition in topology assignment

Set the timeout for rebalanced/reassigned workers to 4 minutes

Graceful worker shutdown

Improvements to the Thrift server

Avoids mistakenly killing workers while rebalancing jobs.

…

Page 42: Jstorm introduction-0.9.6

More Documentation

https://github.com/alibaba/JStorm/wiki

Google-group:[email protected]

Wangwang:JStorm

QQ:228374502

Laiwang: JStorm

Page 43: Jstorm introduction-0.9.6

Join us

You are welcome to join us

[email protected]

Page 44: Jstorm introduction-0.9.6

纪君祥(Longda Feng)