
Functional Comparison and Performance Evaluation 毛玮

王华峰

张天伦

2016/9/10


Overview

• Streaming Core
• MISC
• Performance Benchmark
• Choose your weapon!

*Other names and brands may be claimed as the property of others.


Frameworks: Apache Spark Streaming*, Apache Flink*, Apache Storm*, Apache Storm Trident*, Apache Gearpump*, Twitter Heron*

This is the critical part, as it affects many features.

• Micro-Batch: Checkpoint per Batch
• Continuous Streaming: Checkpoint “per Batch”
• Continuous Streaming: Ack per Record

[Figure: Source → Operator → Sink pipelines for the three models. Elements shown: a Driver with job status and Storage on HDFS; a JobManager/HDFS checkpoint store; an Acker. Tracked fields: id, offset, state, str, ack.]

[Figure: the three execution models (Micro-Batch with Checkpoint per Batch; Continuous Streaming with Checkpoint “per Batch”; Continuous Streaming with Ack per Record) placed on three trade-off axes: Low Latency vs. High Latency, High Throughput vs. Low Throughput, High Overhead vs. Low Overhead.]

Delivery Guarantee

At least once

• Ackers know whether a record was processed successfully. If it failed, the record is replayed.
• There is no state consistency guarantee.

Exactly once

• State is persisted in durable storage.
• Checkpoint is linked with state storage per batch.
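The at-least-once behaviour can be illustrated with a minimal Python sketch. It is not Storm's acker implementation; `process_at_least_once` and `flaky_counter` are hypothetical names, and the failure on the first attempt for "bar" is simulated. The point is that a replayed record can update user state twice, which is why at-least-once delivery gives no state consistency guarantee.

```python
def process_at_least_once(records, handler, max_attempts=3):
    """Replay each record until the handler acknowledges it (returns True)."""
    delivered = []  # every delivery attempt, including replays
    for record in records:
        for attempt in range(1, max_attempts + 1):
            delivered.append(record)
            if handler(record, attempt):  # ack received, move on
                break
        else:
            raise RuntimeError(f"{record!r} failed after {max_attempts} attempts")
    return delivered


def flaky_counter(state):
    """Hypothetical handler: counts records, then fails once on 'bar'."""
    def handler(record, attempt):
        state[record] = state.get(record, 0) + 1  # side effect happens before the failure
        return not (record == "bar" and attempt == 1)
    return handler


state = {}
attempts = process_at_least_once(["foo", "bar"], flaky_counter(state))
print(attempts)  # "bar" is delivered twice
print(state)     # and therefore counted twice
```

An exactly-once design would instead tie the state update to a checkpoint, so that a replay starts from the state as of the last completed batch.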


Native State Operator

Yes* Yes Yes

• Flink Java API: ValueState, ListState, ReducingState
• Flink Scala API: mapWithState
• Gearpump: persistState
• Spark 1.5: updateStateByKey
• Spark 1.6: mapWithState
• Trident: persistentAggregate, State
• Storm: KeyValueState
• Heron: X (state is maintained by the user)
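What a native keyed state operator provides can be sketched in plain Python. This is only an illustration of the idea behind operators such as `mapWithState`; the function names below are hypothetical and do not correspond to any framework's API.

```python
def map_with_state(records, update):
    """Apply `update` to each (key, value) with that key's current state."""
    state = {}
    emitted = []
    for key, value in records:
        new_state, output = update(key, value, state.get(key))
        state[key] = new_state  # the framework, not the user, keeps this per key
        emitted.append(output)
    return emitted, state


def running_count(key, value, current):
    """State update function: running sum per key."""
    total = (current or 0) + value
    return total, (key, total)


out, final_state = map_with_state(
    [("foo", 1), ("foo", 1), ("bar", 1)], running_count
)
print(out)
print(final_state)
```

In a real framework the `state` dictionary is what gets checkpointed or persisted; when the framework offers no such operator, the user has to store and restore it manually.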


Dynamic Load Balance & Recovery Speed

[Figure: two layouts of a Source feeding three executors (exec), annotated with processing times of 10s and 5s and the totals 10s + 5s = 15s.]

Compositional

• Highly customizable operator based on basic building blocks

• Manual topology definition and optimization

TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("spout", new RandomSentenceSpout(), 1);
builder.setBolt("split", new SplitSentence(), 3).shuffleGrouping("spout");
builder.setBolt("count", new WordCount(), 2).fieldsGrouping("split", new Fields("word"));

Spout: “foo, foo, bar” → Bolt (split): “foo”, “foo”, “bar” → Bolt (count): {“foo”: 2, “bar”: 1}

Frameworks: Apache Storm*, Apache Gearpump*, Twitter Heron*

Declarative

• Higher order function as operators (filter, mapWithState…)

• Logical plan optimization

DataStream<String> text = env.readTextFile(params.get("input"));
DataStream<Tuple2<String, Integer>> counts = text.flatMap(new Tokenizer()).keyBy(0).sum(1);

“foo, foo, bar” → “foo”, “foo”, “bar” → {“foo”: 1, “foo”: 1, “bar”: 1} → {“foo”: 2, “bar”: 1}

Frameworks: Apache Spark Streaming*, Apache Flink*, Apache Storm Trident*, Apache Gearpump*
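The same data flow can be traced in a framework-free Python sketch. The helper names (`flat_map`, `key_by_and_sum`, `word_count`) are hypothetical; they only mirror the flatMap, keyBy and sum steps of the declarative example and are not any framework's API.

```python
from collections import defaultdict


def flat_map(fn, items):
    """Apply fn to every item and flatten the results."""
    return [out for item in items for out in fn(item)]


def key_by_and_sum(pairs):
    """Group (key, value) pairs by key and sum the values."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def word_count(lines):
    # "foo, foo, bar" -> ("foo", 1), ("foo", 1), ("bar", 1)
    pairs = flat_map(lambda line: [(w.strip(), 1) for w in line.split(",")], lines)
    return key_by_and_sum(pairs)


result = word_count(["foo, foo, bar"])
print(result)
```

The user only states what to compute with higher-order functions; how the stages are placed and chained is left to the framework, which is what makes logical plan optimization possible.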


Statistical

• Data scientist friendly

• Dynamic type

Python

lines = ssc.textFileStream(params.get("input"))
words = lines.flatMap(lambda line: line.split(","))
pairs = words.map(lambda word: (word, 1))
counts = pairs.reduceByKey(lambda x, y: x + y)
counts.saveAsTextFiles(params.get("output"))

R

lines <- textFile(sc, "input")
words <- flatMap(lines, function(line) {
  strsplit(line, " ")[[1]]
})
wordCount <- lapply(words, function(word) {
  list(word, 1L)
})
counts <- reduceByKey(wordCount, "+", 2L)

Frameworks: Apache Spark Streaming*, Apache Storm*, Twitter Heron*

˚ Structured Streaming*
˚ Apache Storm*

SQL

Pure Style (Storm SQL):

CREATE EXTERNAL TABLE ORDERS (ID INT PRIMARY KEY, UNIT_PRICE INT, QUANTITY INT)
LOCATION 'kafka://localhost:2181/brokers?topic=orders' TBLPROPERTIES '{...}}'

INSERT INTO LARGE_ORDERS
SELECT ID, UNIT_PRICE * QUANTITY AS TOTAL FROM ORDERS WHERE UNIT_PRICE * QUANTITY > 50

bin/storm sql XXXX.sql

Fusion Style (Spark Streaming):

InputDStream.transform((rdd: RDD[Order], time: Time) => {
  import sqlContext.implicits._
  rdd.toDF.registerTempTable("ORDERS")
  val SQL = "SELECT ID, UNIT_PRICE * QUANTITY AS TOTAL FROM ORDERS WHERE UNIT_PRICE * QUANTITY > 50"
  val largeOrderDF = sqlContext.sql(SQL)
  largeOrderDF.rdd
})

Frameworks: Apache Spark Streaming*, Apache Flink*, Structured Streaming, Apache Storm Trident*

Summary

| Compositional | Declarative | Python/R | SQL |
|---|---|---|---|
| X | √ | √ | √ |
| √ | X | √ | NOT support aggregation, windowing and joining |
| X | √ | X | |
| √ | √ | X | X |
| X | √ | X | Support select, from, where, union |
| √ | X | √˚ | X |

Frameworks: Apache Spark Streaming*, Apache Storm*, Apache Flink*, Apache Storm Trident*, Apache Gearpump*, Twitter Heron*

• Multi Tasks of Multi Applications on Single Process
• Single Task on Single Process

[Figure: JVM processes containing threads and tasks. In one layout a JVM process hosts tasks from application A and tasks from application B; in the other each JVM process hosts a single task and connects with the local SM.]

Frameworks: Twitter Heron*, Apache Flink*

• Multi Tasks of Single Application on Single Process
  o Single task on single thread
  o Multi tasks on single thread

[Figure: JVM processes containing threads; each thread runs either one task or several tasks.]

Frameworks: Apache Spark Streaming*, Apache Storm*, Apache Storm Trident*, Apache Gearpump*

● Window Support ● Out-of-order Processing ● Memory Management

● Resource Management ● Web UI ● Community Maturity


Window Support

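• Sliding Window
• Count Window
• Session Window

[Figure: timelines (t) illustrating a session window: events whose spacing is smaller than the session gap belong to the same session.]

| Framework | Sliding Window | Count Window | Session Window |
|---|---|---|---|
| Apache Spark Streaming* | √ | X | X˚ |
| Apache Flink* | √ | √ | X |
| Apache Storm* | √ | √ | X |
| Apache Storm Trident* | √˚ | X | X |
| Apache Gearpump* | √ | √ | √ |
| Apache Heron* | X | X | X |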
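Session windows are the least familiar of the three window types, so a minimal Python sketch may help. It assumes integer timestamps and a fixed gap; `session_windows` is a hypothetical helper, not a framework API.

```python
def session_windows(timestamps, gap):
    """Group event timestamps into sessions.

    An event joins the current session when it arrives less than `gap`
    after the previous event; otherwise it starts a new session.
    """
    sessions = []
    for t in sorted(timestamps):
        if sessions and t - sessions[-1][-1] < gap:
            sessions[-1].append(t)
        else:
            sessions.append([t])
    return sessions


sessions = session_windows([1, 2, 4, 10, 11, 20], gap=5)
print(sessions)
```

Unlike sliding and count windows, the window boundaries here depend on the data itself, which is one reason session windows are harder for a framework to support natively.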


Out-of-order Processing

| Processing Time | Event Time | Watermark |
|---|---|---|
| √ | √˚ | X˚ |
| √ | √ | √ |
| √ | X | X |
| √ | √ | √ |
| √ | √ | √ |
| √ | X | X |

Frameworks: Apache Spark Streaming*, Apache Storm*, Apache Flink*, Apache Storm Trident*, Apache Gearpump*, Twitter Heron*

Memory Management

| Framework | JVM Manage | Self Manage on-heap | Self Manage off-heap |
|---|---|---|---|
| Apache Spark Streaming* | √ | √˚ | √˚ |
| Apache Flink* | √ | √ | √ |
| Apache Storm* | √ | X | X |
| Apache Gearpump* | √ | X | X |
| Twitter Heron* | √ | X | X |

Resource Management

| Standalone | YARN | Mesos |
|---|---|---|
| √ | √ | √ |
| √ | √˚ | √˚ |
| √ | √˚ | √˚ |
| √ | √ | X |
| √ | √ | X |
| √ | √ | √ |

Frameworks: Apache Spark Streaming*, Apache Storm*, Apache Flink*, Apache Storm Trident*, Apache Gearpump*, Twitter Heron*

Web UI

| Submit Jobs | Cancel Jobs | Inspect Jobs | Show Statistics | Show Input Rate | Check Exceptions | Inspect Config | Alert |
|---|---|---|---|---|---|---|---|
| X | √ | √ | √ | √ | √ | √ | X |
| X | √ | √ | √ | √˚ | √ | √ | X |
| √ | √ | √ | √ | √˚ | √ | √ | X |
| √ | √ | √ | √ | X | √ | √ | X |
| X | X | √ | √ | √˚ | √ | √ | X |

Frameworks: Apache Spark Streaming*, Apache Flink*, Apache Storm*, Apache Gearpump*, Twitter Heron*

Community Maturity

[Figure: bar chart “Past 3 Months Summary on JIRA”, Created vs. Resolved issues for Spark, Storm, Gearpump, Flink and Heron.]

[Figure: bar chart “Past 1 Month Summary on GitHub”, Commits vs. Committers for Spark, Storm, Gearpump, Flink and Heron.]

| Framework | Initiation Time | Apache Top Project | Contributors |
|---|---|---|---|
| Apache Spark Streaming* | 2013 | 2014 | 926 |
| Apache Storm* | 2011 | 2014 | 219 |
| Apache Gearpump* | 2014 | Incubator | 21 |
| Apache Flink* | 2010 | 2015 | 208 |
| Twitter Heron* | 2014 | N/A | 44 |

Intel does not control or audit third-party benchmark data or the web sites referenced in this document. You should visit the referenced web site and confirm whether referenced data are accurate.

Source website: https://issues.apache.org/jira/secure/Dashboard.jspa

Source website: https://github.com/apache/spark/pulse/monthly


HiBench 6.0

Test Philosophy

• “Lazy Benchmarking”
• Simple test cases are used to infer practical use cases

The Setup

Apache Kafka* Cluster (x3)

• CPU: 2 x Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz
• Mem: 128 GB
• Disk: 8 x HDD (1TB)
• Network: 10 Gbps

Test Cluster (x7)

• CPU: 2 x Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz
• Core: 20 / 24
• Mem: 80 / 128 GB
• Disk: 8 x HDD (1TB)
• Network: 10 Gbps

[Figure: the two clusters connected by a 10 Gbps link.]

| Name | Version |
|---|---|
| Java | 1.8 |
| Scala | 2.11.7 |
| Apache Hadoop* | 2.6.2 |
| Apache Zookeeper* | 3.4.8 |
| Apache Kafka* | 0.8.2.2 |
| Apache Spark* | 1.6.1 |
| Apache Storm* | 1.0.1 |
| Apache Flink* | 1.0.3 |
| Apache Gearpump* | 0.8.1 |

• Apache Heron* requires a specific operating system (Ubuntu/CentOS/Mac OS)
• Structured Streaming doesn’t support Kafka source yet (Spark 2.0)

Architecture

[Figure: benchmark architecture. A Data Generator feeds Topic A on three Kafka Brokers. The Test Cluster (Standalone) consists of a Client, a Master and seven Slaves (20 cores, 80 GB memory each); it consumes Topic A and writes to Topic B. A Metrics Reader collects In Time and Out Time and writes the Result to the File System.]

Latency = Out Time – In Time
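The latency metric reported in the result charts (P99 of Out Time minus In Time) can be computed as in the following Python sketch. The nearest-rank percentile method and the function name `p99_latency` are assumptions for illustration; the slides do not say how HiBench computes the percentile.

```python
def p99_latency(in_times, out_times):
    """P99 of per-record latency (out - in), using the nearest-rank method."""
    latencies = sorted(out - inn for inn, out in zip(in_times, out_times))
    n = len(latencies)
    rank = (99 * n + 99) // 100  # ceil(0.99 * n) in integer arithmetic
    return latencies[rank - 1]


# Synthetic example: 100 records, 98 with latency 1, one with 2, one with 50.
in_times = list(range(100))
out_times = [t + (50 if t == 99 else 2 if t == 98 else 1) for t in in_times]
p99 = p99_latency(in_times, out_times)
print(p99)
```

A tail percentile hides a single extreme outlier (the latency of 50 above) but still exposes systematic delays such as a growing scheduling backlog.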


Framework Configuration

| Framework | Related Configuration |
|---|---|
| Apache Spark Streaming* | 7 Executors, 140 Parallelism |
| Apache Flink* | 7 TaskManagers, 140 Parallelism |
| Apache Storm* | 28 Workers, 140 KafkaSpouts |
| Apache Gearpump* | 28 Executors, 140 KafkaSources |

Raw Input Data

• Kafka Topic Partition: 140

• Size Per Message (configurable): 200 bytes

• Raw Input Message Example:

“0,227.209.164.46,nbizrgdziebsaecsecujfjcqtvnpcnxxwiopmddorcxnlijdizgoi,1991-06-10,0.115967035,Mozilla/5.0 (iPhone; U; CPU like Mac OS X) AppleWebKit/420.1 (KHTML like Gecko) Version/3.0 Mobile/4A93 Safari/419.3,YEM,YEM-AR,snowdrops,1”

• Strong Type: class UserVisit (ip, sessionId, browser)

• Keep feeding data at a specific rate for 5 minutes

Data Input Rate

| Throughput | Messages/Second | Kafka Producer Num |
|---|---|---|
| 40KB/s | 0.2K | 1 |
| 400KB/s | 2K | 1 |
| 4MB/s | 20K | 1 |
| 40MB/s | 200K | 1 |
| 80MB/s | 400K | 1 |
| 400MB/s | 2M | 10 |
| 600MB/s | 3M | 15 |
| 800MB/s | 4M | 20 |
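The throughput column follows from the message rate and the 200-byte message size stated under Raw Input Data. A small Python check, assuming decimal units (1 MB = 1,000,000 bytes); `throughput_mb_per_s` is a hypothetical helper name:

```python
MESSAGE_SIZE_BYTES = 200  # size per message, from the Raw Input Data slide


def throughput_mb_per_s(messages_per_second, message_size=MESSAGE_SIZE_BYTES):
    """Input throughput in MB/s for a given message rate."""
    return messages_per_second * message_size / 1_000_000


for rate in (200, 2_000, 20_000, 200_000, 400_000, 2_000_000, 3_000_000, 4_000_000):
    print(f"{rate:>9} msg/s -> {throughput_mb_per_s(rate)} MB/s")
```

For example, 200K messages per second at 200 bytes each is 40 MB/s, and 4M messages per second is 800 MB/s.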


Test Case: Identity

The application reads input data from Kafka and immediately writes the result back to Kafka. No complex business logic is involved.

Result

[Figure: P99 Latency (s) vs. Input Rate (MB/s) for the Identity test case. Series: Apache Spark*, Apache Flink*, Apache Storm* without Ack, Apache Storm* with Ack.]

For more complete information about performance and benchmark results, visit www.intel.com/benchmarks. Results have been estimated or simulated using internal Intel analysis or architecture simulation or modeling, and provided to you for informational purposes. Any differences in your system hardware, software or configuration may affect your actual performance.

Test Case: Repartition

This test case measures the efficiency of data shuffle across the network.

[Figure: Network Shuffle]
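What a repartition step does can be sketched with hash partitioning in Python. This is a generic illustration, not the partitioner of any framework in the comparison; `repartition` is a hypothetical helper, and CRC32 is used only because it gives a stable hash across runs.

```python
import zlib


def repartition(records, num_partitions):
    """Route each (key, value) record to a partition chosen by the key's hash."""
    partitions = [[] for _ in range(num_partitions)]
    for key, value in records:
        index = zlib.crc32(key.encode("utf-8")) % num_partitions
        partitions[index].append((key, value))
    return partitions


records = [("foo", 1), ("bar", 1), ("foo", 2), ("baz", 1)]
parts = repartition(records, num_partitions=3)
print(parts)
```

In a cluster each partition lives on a different worker, so almost every record has to be serialized and sent over the network, which is the cost this test case exposes.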


Result

[Figure: P99 Latency (s) vs. Input Rate (MB/s) for the Repartition test case. Series: Apache Spark*, Apache Flink*, Apache Storm* without Ack, Apache Gearpump*, Apache Storm* with Ack.]

[Figure: Throughput (MB/s) vs. Input Rate (MB/s) for the Repartition test case. Series: Apache Spark*, Apache Flink*, Apache Storm* without Ack, Apache Gearpump*, Apache Storm* with Ack.]

Observation

• Flink and Storm have similar performance and are better choices for meeting a sub-second SLA requirement if no repartition happens.

• Spark Streaming needs to schedule tasks with additional context. With a tiny batch interval, the overhead can be dramatically worse than in other frameworks.

• According to our test, the minimum batch interval of Spark is about 80 ms (140 tasks per batch); below that, the task schedule delay keeps increasing.

• Repartition is heavy for every framework, but it is usually unavoidable.

• Latency of Gearpump is still quite low even under 800MB/s input throughput.
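The 80 ms figure can be turned into a rough per-task scheduling budget. The arithmetic below is a back-of-the-envelope reading of the numbers on the slide, not a measurement; it assumes every batch launches exactly 140 tasks, and `scheduling_budget` is a hypothetical helper.

```python
def scheduling_budget(tasks_per_batch, batch_interval_ms):
    """Return (tasks launched per second, milliseconds available per task)."""
    tasks_per_second = tasks_per_batch * 1000 / batch_interval_ms
    ms_per_task = batch_interval_ms / tasks_per_batch
    return tasks_per_second, ms_per_task


tasks_per_second, ms_per_task = scheduling_budget(tasks_per_batch=140, batch_interval_ms=80)
print(f"{tasks_per_second:.0f} tasks/s, {ms_per_task:.2f} ms per task")
```

At 140 tasks every 80 ms the scheduler has to launch about 1750 tasks per second, or one task roughly every 0.57 ms; a shorter interval leaves less time than that, which is consistent with the growing schedule delay reported above.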


Test Case: Stateful WordCount

Native state operator is supported by all frameworks we evaluated

Stateful operator performance + Checkpoint/Acker cost


Result

[Figure: P99 Latency (s) vs. Input Rate (MB/s) for the Stateful WordCount test case. Series: Apache Spark*, Apache Flink*, Apache Flink* without CP, Apache Storm*, Apache Gearpump*.]

[Figure: Throughput (MB/s) vs. Input Rate (MB/s) for the Stateful WordCount test case. Series: Apache Spark*, Apache Flink*, Apache Storm*, Apache Gearpump*.]

Observation

• Exactly-once semantics usually require state management and checkpointing, but better guarantees come at a high cost.

• There is no obvious performance difference in Flink when switching fault tolerance on or off.

• Checkpoint mechanisms and storage play a critical role here.

Test Case: Window Based Aggregation

This test case maintains a 10-second sliding window.
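A sliding-window aggregation of this kind can be sketched in Python as follows. The 10-second window length comes from the slide; the 5-second slide step, the integer timestamps and the helper name `sliding_window_counts` are assumptions made for the example.

```python
from collections import Counter


def sliding_window_counts(events, window_s=10, slide_s=5):
    """Count keys per sliding window.

    events: iterable of (timestamp_seconds, key) with integer timestamps.
    Returns {window_start: Counter}, where a window covers
    [window_start, window_start + window_s).
    """
    windows = {}
    for ts, key in events:
        start = (ts // slide_s) * slide_s  # latest aligned window start <= ts
        while start > ts - window_s:       # every window that still contains ts
            if start >= 0:                 # ignore windows starting before time 0
                windows.setdefault(start, Counter())[key] += 1
            start -= slide_s
    return dict(sorted(windows.items()))


windows = sliding_window_counts([(1, "a"), (6, "a"), (12, "b")])
print(windows)
```

Because the windows overlap, each event is counted in `window_s / slide_s` windows, so the state the framework must keep grows with both the window length and the input rate.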


Result

[Figure: P99 Latency (s) vs. Input Rate (MB/s) for the Window Based Aggregation test case. Series: Apache Spark*, Apache Flink*, Apache Storm*.]

[Figure: Throughput (MB/s) vs. Input Rate (MB/s) for the Window Based Aggregation test case. Series: Apache Spark*, Apache Flink*, Apache Storm*.]

Observation

• The native streaming execution model helps here

Frameworks: Apache Spark Streaming*, Apache Storm*, Apache Flink*

Do your own benchmark

HiBench: a cross-platform micro-benchmark suite for big data

(https://github.com/intel-hadoop/HiBench)

Open source since 2012

Better streaming benchmark support will be included in the next release [HiBench 6.0]

Legal Disclaimer

No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.

Intel does not control or audit third-party benchmark data or the web sites referenced in this document. You should visit the referenced web site and confirm whether referenced data are accurate.

Intel and the Intel logo are trademarks of Intel Corporation in the U.S. and/or other countries.

Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.

Configurations:

Hardware:

Apache Kafka* Cluster - CPU: 2 x Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz, Mem: 128 GB, Disk: 8 x HDD (1TB), Network: 10 Gbps.

Test Cluster - CPU: 2 x Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz, Core: 20 / 24, Mem: 80 / 128 GB, Disk: 8 x HDD (1TB), Network: 10 Gbps.

Software:

The software framework configuration is shown on the Framework Configuration slide (page 29). The test results on pages 34, 37, 41 and 45 (Identity, Repartition, Stateful WordCount and Window Based Aggregation) used the above configurations.

For more details of the above configurations, contact Wei Mao ( [email protected] ) or Huafeng Wang ( [email protected] ).

Copyright ©2016 Intel Corporation.