what’s new on the microsoft azure data platform
TRANSCRIPT
JUNE 25, 2015 | SLIDE 1
www.realdolmen.com
WHAT’S NEW ON THE MICROSOFT AZURE DATA PLATFORM
JUNE 25, 2015 | SLIDE 2
#Name: Joris Poelmans #Function: Solution Architect
#Email:[email protected] #Twitter: jopxtwits
#Blog: jopx.blogspot.com
#Slideshare: www.slideshare.net/jplq631
Company: www.realdolmen.com
JUNE 25, 2015 | SLIDE 5
CONTOSO ELECTRONICS CASE
Who is interacting with a product to purchase?
Which products to showcase?
How to automate customer support?
JUNE 25, 2015 | SLIDE 6
INSIDE THE STORE OF THE FUTURE
… but 73% think it’s a plus when an online store also has an offline sales outlet
35% of Amazon purchases are based on personalized recommendations, 75% for Netflix
DELIVER A PERSONALIZED EXPERIENCE WITH A HUMAN TOUCH
JUNE 25, 2015 | SLIDE 7
CUSTOMER CENTRICITY THROUGH DATA
Event / Data producers
Screen interaction
In-Store Activity
Social Data
Ingest Transform Long-term storage
Cloud storage
Predictive Analytics
Predictive/prescriptive analytics
Presentation and action
On premise
JUNE 25, 2015 | SLIDE 8
… POWERED BY MICROSOFT CLOUD
Event / Data producers
Web logs
In-Store Activity
Social Data
Ingest Transform Long-term storage
Azure SQL Database & Azure Storage
Predictive Analytics
Azure Machine Learning
Presentation and action
Azure Event Hubs, Azure Stream Analytics, Azure HDInsight, Azure ML
On premise
JUNE 25, 2015 | SLIDE 10
Event / Data producers
Web logs
In-Store Activity
Social Data
Ingest
Azure Event Hubs, Azure Stream Analytics, Azure HDInsight, Azure ML
JUNE 25, 2015 | SLIDE 12
EVENT VELOCITY
Device telemetry
Thermostats report data every 15 minutes
Cars send telemetry data every minute
Application telemetry
Application performance counters are measured every second per server
Mobile app telemetry is captured for every action on your app!
Application and operational events
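To get a feel for these rates, the reporting intervals above can be turned into daily event counts. This is only a minimal sketch of the arithmetic; the fleet size of 10,000 cars is a made-up illustration value, not a figure from the talk.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def events_per_day(interval_seconds: float, sources: int = 1) -> int:
    """Events emitted per day by `sources` producers that each report
    once every `interval_seconds`."""
    return int(SECONDS_PER_DAY / interval_seconds) * sources


# Intervals taken from the slide, for a single producer each.
thermostat = events_per_day(15 * 60)  # every 15 minutes
car = events_per_day(60)              # every minute
server_counter = events_per_day(1)    # every second

# Hypothetical fleet size, purely for illustration.
car_fleet = events_per_day(60, sources=10_000)

print(thermostat, car, server_counter, car_fleet)
```

Even modest per-device rates add up quickly once the number of producers grows, which is the motivation for a dedicated ingestion service.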
JUNE 25, 2015 | SLIDE 13
AZURE EVENT HUBS
Event Producers
HTTPS
AMQP 1.0
Throughput Units:
• 1 ≤ TUs ≤ Partition Count
• TU: 1 MB/s writes, 2 MB/s reads
Event Producers
AMQP 1.0
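The throughput-unit rule above can be sketched as a small sizing helper. It only illustrates the arithmetic, assuming the per-TU figures from the slide (1 MB/s writes, 2 MB/s reads). The function name and the example loads are hypothetical, and nothing here is an Azure SDK call.

```python
import math

INGRESS_MB_PER_TU = 1.0  # MB/s of writes per throughput unit (from the slide)
EGRESS_MB_PER_TU = 2.0   # MB/s of reads per throughput unit (from the slide)


def required_throughput_units(ingress_mb_s: float,
                              egress_mb_s: float,
                              partition_count: int) -> int:
    """Smallest TU count that covers both the write and the read load,
    respecting 1 <= TUs <= partition count."""
    needed = max(
        1,
        math.ceil(ingress_mb_s / INGRESS_MB_PER_TU),
        math.ceil(egress_mb_s / EGRESS_MB_PER_TU),
    )
    if needed > partition_count:
        raise ValueError(
            f"{needed} TUs needed but only {partition_count} partitions available"
        )
    return needed


# Hypothetical load: 3.5 MB/s written, 4 MB/s read, 8 partitions.
print(required_throughput_units(3.5, 4.0, partition_count=8))
```

In this sketch the write side is the bottleneck: 3.5 MB/s of ingress rounds up to 4 TUs, while 4 MB/s of egress would need only 2.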
JUNE 25, 2015 | SLIDE 15
… POWERED BY MICROSOFT CLOUD
Event / Data producers
Web logs
In-Store Activity
Social Data
Ingest Transform
Azure Event Hubs, Azure Stream Analytics, Azure HDInsight, Azure ML
JUNE 25, 2015 | SLIDE 16
DATA AT REST
DATA AT REST DATA IN MOTION
SELECT COUNT(*) FROM PARKINGLOT
WHERE type='CAR'
AND color='RED'
?
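The contrast between the two can be sketched in a few lines. At rest, the whole table exists and one query scans it once. In motion, events keep arriving, so the question only makes sense over a time window. The data, the 60-second window, and the function name below are illustrative assumptions, not part of the talk.

```python
from collections import deque

# Data at rest: the full table is available, so a single scan answers the query.
parking_lot = [
    {"type": "CAR", "color": "RED"},
    {"type": "CAR", "color": "BLUE"},
    {"type": "TRUCK", "color": "RED"},
    {"type": "CAR", "color": "RED"},
]
red_cars_at_rest = sum(
    1 for v in parking_lot if v["type"] == "CAR" and v["color"] == "RED"
)


def count_red_cars_in_window(events, window_seconds):
    """Data in motion: after each event, yield (timestamp, number of red cars
    seen within the last `window_seconds`)."""
    window = deque()  # timestamps of matching events still inside the window
    for ts, vehicle_type, color in events:
        if vehicle_type == "CAR" and color == "RED":
            window.append(ts)
        # Drop matches that have fallen out of the window.
        while window and window[0] <= ts - window_seconds:
            window.popleft()
        yield ts, len(window)


# Hypothetical stream of (timestamp in seconds, type, color) events.
stream = [
    (0, "CAR", "RED"),
    (20, "CAR", "BLUE"),
    (40, "CAR", "RED"),
    (70, "TRUCK", "RED"),
    (110, "CAR", "RED"),
]
counts = list(count_red_cars_in_window(stream, window_seconds=60))
print(red_cars_at_rest, counts)
```

The at-rest query returns one number. The in-motion version returns an answer that changes as events arrive and age out.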
JUNE 25, 2015 | SLIDE 17
TRACK SHELF INVENTORY IN REAL TIME
1 Track your products through shelf sensors, RFIDs, price tags, or Wi-Fi Way Finding. As inventory is removed from the store shelf, the store info updates in real time.
2 Configure the system to notify an employee to restock when inventory drops below a preconfigured range.
3 Access real-time inventory data on your devices.
4 Track inventory end-to-end; from the manufacturer, through shipments and stocking, to the floor, and to sale.
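Steps 1 and 2 above could look roughly like this in code. This is a minimal sketch, assuming shelf sensors emit (product, quantity removed) events and that "below a preconfigured range" means a single numeric threshold. All names and numbers are hypothetical.

```python
def restock_alerts(initial_stock, removal_events, threshold):
    """Apply shelf-removal events to the stock levels and collect one alert
    per product the first time its level drops below `threshold`."""
    stock = dict(initial_stock)  # copy so the caller's data is untouched
    alerted = set()
    alerts = []
    for product, quantity in removal_events:
        stock[product] -= quantity
        if stock[product] < threshold and product not in alerted:
            alerted.add(product)
            alerts.append((product, stock[product]))
    return stock, alerts


# Hypothetical shelf contents and sensor events.
shelf = {"tv": 5, "phone": 10}
events = [("tv", 2), ("phone", 1), ("tv", 1), ("tv", 1)]
final_stock, alerts = restock_alerts(shelf, events, threshold=3)
print(final_stock, alerts)
```

Alerting only once per product avoids notifying an employee repeatedly while the shelf stays low. A real system would reset that flag after restocking.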
JUNE 25, 2015 | SLIDE 18
Intake millions of events per second
Process data from connected devices/apps
Integrated with highly scalable publish-subscribe ingestor
Easy processing on continuous streams of data
Transform, augment, correlate, temporal operations
Detect patterns and anomalies in streaming data
Correlate streaming with reference data
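The last two capabilities can be illustrated with plain Python generators. The sketch below is not how Azure Stream Analytics implements them. It assumes a lookup join against an in-memory reference table and a simple z-score rule over the previous five readings, and the sensor names and values are made up.

```python
from statistics import mean, pstdev

# Hypothetical reference data: sensor id -> store section.
REFERENCE = {"s1": "audio", "s2": "tv"}


def enrich(events, reference):
    """Correlate each streaming event with reference data (a lookup join)."""
    for event in events:
        yield {**event, "section": reference.get(event["sensor"], "unknown")}


def flag_anomalies(events, history_size=5, z_threshold=3.0):
    """Yield (event, is_anomaly), comparing each value against the mean and
    standard deviation of the previous `history_size` values."""
    history = []
    for event in events:
        is_anomaly = False
        if len(history) >= history_size:
            recent = history[-history_size:]
            mu, sigma = mean(recent), pstdev(recent)
            is_anomaly = sigma > 0 and abs(event["value"] - mu) / sigma > z_threshold
        history.append(event["value"])
        yield event, is_anomaly


# Hypothetical readings from one sensor, with a spike at the sixth event.
readings = [{"sensor": "s1", "value": v} for v in [10, 11, 10, 9, 10, 50, 10]]
results = list(flag_anomalies(enrich(readings, REFERENCE)))
flags = [is_anomaly for _, is_anomaly in results]
print(flags)
```

Because both steps are generators, events flow through one at a time, which mirrors how a streaming job processes data without waiting for the full data set.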
JUNE 25, 2015 | SLIDE 19
AZURE STREAM ANALYTICS
No hardware acquisition and maintenance
No software provisioning and maintenance
Up and running in a few clicks
Elasticity of the cloud for scale up or scale down
Low startup cost
Built-in monitoring
SQL-like language available to create stream processing solutions
Development and debugging experience through Azure Portal
Integrated with Event Hubs, Azure Blobs and Azure SQL DB
JUNE 25, 2015 | SLIDE 22
Event / Data producers
Web logs
In-Store Activity
Social Data
Ingest Transform Long-term storage
Azure SQL Database & Azure Storage
Azure Event Hubs, Azure Stream Analytics, Azure HDInsight, Azure ML
On premise
JUNE 25, 2015 | SLIDE 23
HADOOP AND MODERN DATA ARCHITECTURE
Apache Hadoop is an open source framework that supports data-intensive distributed applications
Uses HDFS storage to enable applications to work with 1000s of nodes and petabytes of data using a scale-out model
Uses MapReduce to process data
Inspired by Google's MapReduce and Google File System
Related projects: HBase, Hive, Mahout, Pig, Sqoop, Ambari, Storm, ZooKeeper, ... and many more
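The MapReduce model mentioned above can be sketched in a single process with the classic word-count example. This is a toy illustration of the map, shuffle and reduce phases, not Hadoop code. On a real cluster, the phases run in parallel across many nodes and the function names here are hypothetical.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


print(word_count(["big data", "Big cloud data"]))
```

Because each map call looks at one document and each reduce call at one key, both phases can be distributed over many machines, which is what enables the scale-out model.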
JUNE 25, 2015 | SLIDE 24
AZURE HDINSIGHT
Data Node Data Node Data Node Data Node
Task Tracker Task Tracker Task Tracker Task Tracker
Name Node
Job Tracker
HMaster Coordination
Region Server Region Server Region Server Region Server
Pay for what you use
Use Azure Blob storage
Extend with HBase as a columnar NoSQL transactional database
Support for additional Apache projects such as Storm and Mahout
JUNE 25, 2015 | SLIDE 25
HOW ORGANIZATIONS ARE USING HADOOP
Telecommunications: Churn prediction, CDR analysis, network monitoring, next best offer, …
Financial Services: Customer 360°, fraud detection
Health Care: Clinical trial selection, patent mining, personalized medicine, …
Industry/Utility: Predictive maintenance, supply chain and inventory optimization, smart metering, …
JUNE 25, 2015 | SLIDE 28
Event / Data producers
Web logs
In-Store Activity
Social Data
Ingest Transform Long-term storage
Azure SQL Database & Azure Storage
Predictive Analytics
Azure Machine Learning
Presentation and action
Azure Event Hubs, Azure Stream Analytics, Azure HDInsight, Azure ML
On premise
JUNE 25, 2015 | SLIDE 30
DEFINING MACHINE LEARNING
• Formal definition: “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E” - Tom M. Mitchell
• Another definition: “The goal of machine learning is to program computers to use example data or past experience to solve a given problem.” – Introduction to Machine Learning, 2nd Edition, MIT Press
• ML often involves two primary techniques:
– Supervised Learning: Finding the mapping between inputs and outputs using correct values to “train” a model
– Unsupervised Learning: Finding patterns in the input data (similar to Density Estimates in Statistics)
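The two techniques can be contrasted with a small dependency-free sketch. The choice of a least-squares line fit for the supervised case and a one-dimensional two-cluster k-means for the unsupervised case is an assumption made for illustration. The talk does not name specific algorithms, and the data is made up.

```python
def fit_line(xs, ys):
    """Supervised: learn y = slope * x + intercept from labelled (x, y) pairs
    using ordinary least squares."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    return slope, mean_y - slope * mean_x


def two_means(values, iterations=10):
    """Unsupervised: find two cluster centres in unlabelled 1-D data
    (k-means with k = 2, started from the minimum and maximum)."""
    low, high = min(values), max(values)
    for _ in range(iterations):
        near_low = [v for v in values if abs(v - low) <= abs(v - high)]
        near_high = [v for v in values if abs(v - low) > abs(v - high)]
        if not near_high:  # all points fell into one cluster
            break
        low = sum(near_low) / len(near_low)
        high = sum(near_high) / len(near_high)
    return low, high


# Supervised: inputs with known correct outputs "train" the model.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
prediction = slope * 5 + intercept

# Unsupervised: no labels, only structure in the inputs.
centres = two_means([1, 2, 3, 10, 11, 12])
print(slope, intercept, prediction, centres)
```

The supervised function needs the correct outputs `ys` to learn from, while the unsupervised one only sees the inputs and discovers the two groups on its own.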
JUNE 25, 2015 | SLIDE 31
Vision Analytics
Recommendation engines
Advertising analysis
Weather forecasting for business planning
Social network analysis
Legal discovery and document archiving
Pricing analysis
Fraud detection
Churn analysis
Equipment monitoring
Location-based tracking and services
Personalized Insurance
JUNE 25, 2015 | SLIDE 32
PREDICTIVE ANALYTICS/MACHINE LEARNING
Developing predictive analytics must be simpler; today it requires specialized skills:
• Data management
• Data exploration
• Math & statistics
• Domain expertise
• Machine learning
• Software development
• Data visualization
65% of enterprises feel they have a strategic shortage of data scientists, a role many did not know existed 12 months ago …
JUNE 25, 2015 | SLIDE 42
AZURE POC OFFERING
Duration | 3 days | 10 days | 15 days
Goal | Architecture | Architecture + basic POC | Architecture + POC
Pre-engagement questionnaire | √ | √ | √
Define basic architecture | √ | √ | √
Refine architecture | - | √ | √
Basic POC* (Project Setup, Mobile Services, Web API, Identity, Scalability, Azure Machine Learning, HDInsight) | - | √ | √
Extended POC* (possible topics among others: integration with back-office system, Notifications, Search, Azure Machine Learning, HDInsight, …) | - | - | √
*: scope for POCs to be discussed. Usage costs of Azure will be billed separately.
JUNE 25, 2015 | SLIDE 43
END-TO-END SMART DATA SOLUTIONS
1 Seamless Integration
• Interaction & Transaction Data
• Existing & Emerging Data Sources
• Open Data sets & Commercial Data Providers
2 Scalable Processing
• Distributed Data Platform
• Processing Engine
• Hybrid Architecture
3 Actionable Insights
• Predictive, Prescriptive & Cognitive Analytics
• Data Mining & Machine Learning
• Enriched Applications