Real-Time Data Stream Computing with Spark Streaming on Azure, 左继红
ACP-B205
Real-time computing on Azure
Spark Streaming concepts and features
EventHub example:
val stream = EventHubUtils.createStream(ssc, eventHubName, partitionNum, consumerGroupName)
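// Static reference data as a pair RDD keyed by Int
val dataset: RDD[(Int, String)] = …
// Sliding window: 3-second window length, 2-second slide interval
val metricsDS: DStream[(Int, SensorMetrics)] = stream.window(Seconds(3), Seconds(2))
// Join each windowed RDD with the static dataset on the key
val joinedDS: DStream[(Int, (SensorMetrics, String))] = metricsDS.transform(rdd => rdd.join(dataset))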
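The window-and-join snippet above can be modeled without a cluster. The following is only a rough sketch in plain Scala collections, assuming 1-second batches so that a 3-second window sliding by 2 seconds becomes "3 batches, advancing by 2". `WindowJoinSketch`, the `SensorMetrics` stand-in, and the sample `dataset` values are hypothetical, and Spark's exact window boundary alignment is not reproduced.

```scala
object WindowJoinSketch {
  // Hypothetical stand-in for the SensorMetrics type used in the talk.
  final case class SensorMetrics(ax: Double)

  // Hypothetical static lookup data keyed by Int, playing the role of `dataset`.
  val dataset: Map[Int, String] = Map(1 -> "back", 2 -> "wrist")

  // Groups batches into windows of 3 batches that advance by 2 batches,
  // so consecutive windows overlap by one batch.
  def windows[A](batches: List[List[A]]): List[List[A]] =
    batches.sliding(3, 2).map(_.flatten).toList

  // Inner join on the key, in the spirit of rdd.join(dataset):
  // records whose key is missing from `dataset` are dropped.
  def joinWithDataset(window: List[(Int, SensorMetrics)]): List[(Int, (SensorMetrics, String))] =
    window.flatMap { case (id, metrics) =>
      dataset.get(id).map(label => (id, (metrics, label))).toList
    }
}
```

Because the windows overlap, a record in the shared batch is joined twice, once per window. Downstream aggregations need to tolerate that.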
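val computeMeanFunc = (values: Seq[SensorMetrics], state: Option[SensorState]) => {
  // x-axis acceleration readings from the "back" sensor in this batch
  val back_ax_vals = values.map(_.getSensorReading("back").get.ax)
  // Mean of the readings
  val back_ax_mean = back_ax_vals.reduce(_ + _) / values.size
  // Population standard deviation: square root of the mean squared deviation
  val back_ax_dev = Math.pow(back_ax_vals.map(x => Math.pow(x - back_ax_mean, 2)).reduce(_ + _) / values.size, 0.5)
  ...
}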
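The mean and deviation arithmetic in `computeMeanFunc` can be isolated into a small pure function. This is a minimal sketch, not the speaker's code: `WindowStats` and `meanAndStdDev` are hypothetical names. The signature of `computeMeanFunc` matches what Spark's `updateStateByKey` expects. If it is used that way, `values` can be empty for keys with no new data in a batch, and `reduce` throws on an empty sequence, so the sketch returns `None` in that case.

```scala
object WindowStats {
  // Returns (mean, population standard deviation), or None for an empty batch.
  def meanAndStdDev(values: Seq[Double]): Option[(Double, Double)] =
    if (values.isEmpty) None
    else {
      val n = values.size
      val mean = values.sum / n
      // Mean of squared deviations from the mean (population variance).
      val variance = values.map(x => math.pow(x - mean, 2)).sum / n
      Some((mean, math.sqrt(variance)))
    }
}
```

For example, `WindowStats.meanAndStdDev(Seq(2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0))` yields a mean of 5.0 and a standard deviation of 2.0.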
Integrating EventHub
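Parallel structure avoids resource contention
Events can be retained for multiple days and read repeatedly
Performance can be controlled through Throughput Units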
EventData fields: Offset, Sequence number, Body, User properties, System properties
Event Hub partitions: Partition1, Partition2, Partition3, Partition4
Events are stored in the order they are received
Offset: byte offset
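To make the relationship between arrival order, sequence numbers, and byte offsets concrete, here is a small illustrative model. It is not the Event Hubs SDK: `EventHubModel`, `EventData`, and `Partition` are hypothetical types, the offset is modeled as a `Long`, and only body bytes are counted toward it, which is a simplification.

```scala
object EventHubModel {
  // Hypothetical model of the EventData fields listed above; not the SDK class.
  final case class EventData(
    offset: Long,
    sequenceNumber: Long,
    body: String,
    userProperties: Map[String, String] = Map.empty,
    systemProperties: Map[String, String] = Map.empty
  )

  // One partition: an append-only log kept in arrival order.
  final case class Partition(events: Vector[EventData] = Vector.empty) {
    // The new offset is the byte position just past all earlier bodies;
    // the sequence number simply increments.
    def append(body: String): Partition = {
      val nextOffset = events.lastOption
        .map(e => e.offset + e.body.getBytes("UTF-8").length)
        .getOrElse(0L)
      val nextSeq = events.lastOption.map(_.sequenceNumber + 1).getOrElse(0L)
      copy(events = events :+ EventData(nextOffset, nextSeq, body))
    }

    // Retained events can be re-read: returns everything stored after `offset`.
    def readAfter(offset: Long): Vector[EventData] =
      events.filter(_.offset > offset)
  }
}
```

Because reading does not remove events, a consumer that remembers the last offset it handled can resume from that point or replay from an earlier one.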
Each EventHubReceiver corresponds to one EventHub partition
Uses the EventHubs Java client
Under the hood, the Apache Qpid library accesses EventHub over the AMQP protocol
EventHub stores data durably
ResilientEventHubReceiver recovers automatically
Offsets are checkpointed periodically
Metadata and RDD data are checkpointed periodically
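Reliable Receiver: sends an acknowledgment to the source once the data has been received and reliably stored
Unreliable Receiver: does not send acknowledgments to the source
The Unreliable Receiver ensures reliable data reception through offset checkpointing
Offsets are stored in Azure Blob Storage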
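The offset-checkpointing idea can be sketched as a toy simulation. Everything here is an assumption made for illustration: `OffsetStore` is an in-memory stand-in for the Azure Blob Storage location, `receive` is a hypothetical function rather than the receiver's real API, and checkpoints are taken every few events instead of on a timer.

```scala
import scala.collection.mutable

object OffsetCheckpointSketch {
  // Hypothetical in-memory stand-in for the offset store.
  final class OffsetStore {
    private val saved = mutable.Map.empty[Int, Long]
    def save(partition: Int, offset: Long): Unit = saved.update(partition, offset)
    def load(partition: Int): Option[Long] = saved.get(partition)
  }

  // Reads events (offset, body) that lie after the last checkpointed offset,
  // checkpointing after every `checkpointEvery` events. `crashAfter` simulates
  // a receiver failure by stopping once that many events have been handled.
  def receive(
    partition: Int,
    events: Seq[(Long, String)],
    store: OffsetStore,
    checkpointEvery: Int,
    crashAfter: Int = Int.MaxValue
  ): Vector[String] = {
    val start = store.load(partition).getOrElse(-1L)
    val pending = events.filter(_._1 > start).take(crashAfter)
    pending.zipWithIndex.foreach { case ((offset, _), i) =>
      if ((i + 1) % checkpointEvery == 0) store.save(partition, offset)
    }
    pending.map(_._2).toVector
  }
}
```

In this model, a receiver that crashes between checkpoints restarts from the last saved offset, so events handled after that checkpoint are delivered again. Nothing is lost, but duplicates are possible, which is an at-least-once guarantee.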
Spark cluster deployment on Azure
Demo: analyzing motion signals with Spark Streaming
Comparison of real-time analytics tools on Azure
After-session reminders
https://channel9.msdn.com/Events/Ignite/Microsoft-Ignite-China-2015
http://aka.ms/IgniteChina2015