WASUP Q4 2018, 29 November
Agenda
4:30 Welcome Jonathan Butow
4:35 The Power of Analytics and Machine Learning in Automated Traffic Enforcement David Slack-Smith
5:00 Machine Learning for Safety Peter Condon
5:30 New Capabilities. New Opportunities Jonathan Butow
5:55 Wrap Up Hanlie Erwee
6:10 Networking
The Power of Analytics and Machine Learning in Automated Traffic Enforcement
• Road Safety Council and the Road Safety Commission are responsible for the State’s road safety strategy, Towards Zero 2008 – 2020.
• The strategy aims to reduce the number of people killed or seriously injured on WA roads by 11,000 by 2020.
• Initiatives and investments are guided by the four cornerstones of safe speeds, safe vehicles, safe road use, safe roads and roadsides.
• Primary source of information is crash data.
Background
• Crash data is often of poor quality, and requires a significant amount of domain expertise to interpret correctly.
• Significant time is spent by analysts manipulating data.
• Furthermore, much of the more specific analysis, e.g. of vehicle information, requires matching crash records with records held by other agencies.
• Because of this complexity, being able to answer information requirements clearly can be a challenge.
Background
Case Study – Automated Traffic Enforcement
• Automated traffic enforcement is a foundation of the safe speeds and safe road use cornerstones.
• There is a mix of assets deployed under this program: mobile, fixed and red light speed cameras.
• Rigorous and thorough site selection is critical to ensure these assets are deployed as effectively as possible.
• Site selection reports are reviewed and endorsed by a steering committee with members at the executive level.
Automated Traffic Enforcement
• All WA intersections are ranked using a weighting methodology developed by the Monash University Accident Research Centre that considers all crash types and severities.
• Crashes are weighted by severity, with all crash natures included.
• This list is then refined by WA Police and Main Roads WA based on the presence of an existing camera, suitability and feasibility.
Current State
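The severity-weighted ranking described on this slide can be sketched as a weighted sum per intersection. This is a minimal illustration only: the weights, field names and crash records below are hypothetical, and the slides do not give the actual values used in the Monash methodology.

```python
from collections import defaultdict

# Hypothetical severity weights; the real methodology's weights are not in the slides.
SEVERITY_WEIGHTS = {"fatal": 10.0, "serious": 5.0, "minor": 1.0}

def rank_intersections(crashes, weights=SEVERITY_WEIGHTS):
    """Return (intersection_id, score) pairs sorted from highest to lowest score."""
    scores = defaultdict(float)
    for crash in crashes:
        scores[crash["intersection"]] += weights[crash["severity"]]
    # Sort by descending score, then by id so that ties are deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

# Made-up crash records for illustration only.
sample_crashes = [
    {"intersection": "A", "severity": "minor"},
    {"intersection": "A", "severity": "serious"},
    {"intersection": "B", "severity": "fatal"},
    {"intersection": "C", "severity": "minor"},
]
ranking = rank_intersections(sample_crashes)
```

A refinement step such as the one carried out by WA Police and Main Roads WA would then filter this ranked list, for example by dropping sites that already have a camera.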
• Minimal data prep outside the platform, aside from one merge
• Data prep inside the platform consisted of some date formatting and then deriving a number of variables within the VA module.
• Next step was to explore the data. We created a simple visualisation to assess the quality of the data and to provide a point of reference for later model building.
Model Building
• The first model we built was a logistic regression model
• We had some initial issues because observations with missing values were excluded.
• The data was partitioned into a 40/60 validation/training split.
Model Building
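As a rough illustration of the two data issues on this slide, the sketch below shows complete-case filtering (which is how rows with missing values typically drop out of a regression) followed by a reproducible 40/60 validation/training partition. The field names and data are invented, and the slides do not say how the missing-value issue was actually resolved.

```python
import random

def drop_incomplete(rows):
    """Complete-case filtering: keep only rows with no missing (None) values."""
    return [row for row in rows if all(value is not None for value in row.values())]

def partition(rows, validation_fraction=0.4, seed=42):
    """Shuffle rows reproducibly and split them into (training, validation)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_validation = round(len(shuffled) * validation_fraction)
    return shuffled[n_validation:], shuffled[:n_validation]

# Invented data: every seventh record is missing its speed limit.
rows = [{"id": i, "speed_limit": 60 if i % 7 else None} for i in range(100)]
complete = drop_incomplete(rows)
training, validation = partition(complete)
print(len(rows) - len(complete), "rows dropped for missing values")
```

Comparing the row counts before and after filtering is a quick way to see how much data a model silently loses to missing values.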
• Next we created a decision tree model using the same predictors, and a random forest model, which was not terribly successful in predicting.
• We also used a gradient boosting model
• Once the models were built, we compared them using the model comparison object, which selected the logistic regression as the champion model.
Model Building
• Based on the logistic regression model, we created a new variable to assess the probability of a KSI (killed or seriously injured) crash occurring.
• We then ranked intersections based on the sum of these probabilities to find the most ‘risky’ intersections.
• This identified a number of intersections that we had previously not identified.
• A reduction factor based on research from Monash University was applied to the total number of crashes from the new and old lists.
Model Building
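The scoring and ranking steps on this slide could look roughly like the following sketch. The coefficients, feature names, crash records and the reduction factor are all placeholders chosen for illustration; none of them come from the presentation or from the Monash research.

```python
import math
from collections import defaultdict

def ksi_probability(features, coefficients, intercept):
    """Logistic regression score: P(KSI) = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    linear = intercept + sum(coefficients[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-linear))

def rank_by_summed_risk(crashes, coefficients, intercept):
    """Sum the per-crash KSI probabilities for each intersection and rank them."""
    totals = defaultdict(float)
    for crash in crashes:
        totals[crash["intersection"]] += ksi_probability(
            crash["features"], coefficients, intercept
        )
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))

def expected_crash_reduction(crash_counts, selected_sites, reduction_factor):
    """Apply a single reduction factor to the total crashes at the selected sites."""
    return reduction_factor * sum(crash_counts[site] for site in selected_sites)

# Hypothetical model and data.
coefficients = {"speed_limit_over_60": 1.0, "night": 0.5}
intercept = -2.0
crashes = [
    {"intersection": "X", "features": {"speed_limit_over_60": 1, "night": 0}},
    {"intersection": "X", "features": {"speed_limit_over_60": 1, "night": 0}},
    {"intersection": "Y", "features": {"speed_limit_over_60": 1, "night": 1}},
]
risk_ranking = rank_by_summed_risk(crashes, coefficients, intercept)
saving = expected_crash_reduction({"X": 2, "Y": 1}, ["X"], reduction_factor=0.25)
```

Summing probabilities means an intersection can rank highly either because it has many crashes or because its crashes have a high predicted chance of being KSI crashes.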
• The savings generated by the modelling were the same as those from the existing method.
• However, the ease of reproduction has already saved the Commission significant time, and the results are easier to communicate.
• To better assess risk, the model will be expanded to include volume data and red light contravention to improve accuracy.
Conclusions and Next Steps
Machine Learning for Safety
• Build, maintain & operate transmission and distribution assets: South West Interconnected Network (SWIN)
• 1.1+ million customers
• ~264,000 street lights
• ~237,800 solar PV installations*
• ~570 battery systems*
The SWIN:
• ~102,000 km circuit wire
• 254,920 km²
• ~861,000 poles & towers
• 17,047 GWh electricity transported
* As at 31/5/18
About Western Power
Power Generation Retailer
Western Power
Our Customers
What we do
Network evolution is reliant on community behaviour, technology advancement rates, regulation and policy
• Integrated Network: current SWIS model
• Fringe Disconnection: future model with small number of islanded systems
• Modular Network: future model with variable network types
• Fully Decentralised: extreme model without centralised network
Key: branch network, mesh network, microgrid, stand-alone power system (SPS)
1. Problem definition
2. Key inputs
3. Data not considered
4. Aggregation points
5. Modelling process
6. Partial Dependence plots
Contents
Problem definition
• Western Power has an outstanding safety record
• Understanding incidents is considered an effective way to improve, and we want to consider the data differently
• Project goal is to build a model to predict who is likely to be involved in a near miss each day
• Information will be used to inform our existing safety strategy
• Incidents
• Timesheets
• Weather
• Position history
• Date
Key inputs
• Contractor and office-based staff near misses
• Personal details
• Telematics
Data not considered
• Individuals are too unpredictable with available data
• Work locations (depots) are too coarse to be useful
• Teams (primary reporting code) provide a natural aggregation point for people who typically work closely together
Aggregation points
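Aggregating from individuals to teams might be sketched as below: each (team, day) pair becomes one modelling record, flagged if any member reported a near miss that day. The record layouts for timesheets and incidents are assumptions made for this example, not Western Power's actual data structures.

```python
from collections import defaultdict

def aggregate_to_team_days(timesheets, incidents):
    """Build one record per (team, day) from per-person timesheet entries.

    timesheets: iterable of (person, team, day) tuples (hypothetical layout)
    incidents:  iterable of (person, day) tuples for reported near misses
    """
    incident_keys = set(incidents)
    team_days = defaultdict(lambda: {"headcount": 0, "near_miss": 0})
    for person, team, day in timesheets:
        record = team_days[(team, day)]
        record["headcount"] += 1
        if (person, day) in incident_keys:
            record["near_miss"] = 1  # target: at least one near miss in the team that day
    return dict(team_days)

# Invented example data.
timesheets = [
    ("ann", "T1", "2018-11-01"),
    ("bob", "T1", "2018-11-01"),
    ("cat", "T2", "2018-11-01"),
    ("ann", "T1", "2018-11-02"),
]
incidents = [("bob", "2018-11-01")]
team_days = aggregate_to_team_days(timesheets, incidents)
```

Other inputs such as weather or date features would then be joined onto these team-day records before modelling.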
• Random Forest was used due to past performance on similar models
• SAS Enterprise Miner automatic variable selection
• Receiver Operating Characteristic (ROC) curve on validation data used to select champion model
• Test dataset used to validate performance of champion model
• Champion model retrained on full dataset
Modelling process
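One common way to turn a ROC comparison into a single number is the area under the curve (AUC). The sketch below computes AUC from validation labels and scores and picks the highest-scoring candidate. Using AUC as the selection criterion is an assumption here, since the slide only says the ROC curve was used, and the model names and scores are invented.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))

def select_champion(validation_labels, model_scores):
    """Return the name of the model with the highest validation AUC, plus all AUCs."""
    aucs = {name: roc_auc(validation_labels, scores) for name, scores in model_scores.items()}
    return max(aucs, key=aucs.get), aucs

# Invented validation labels (1 = near miss) and scores from three hypothetical candidates.
validation_labels = [0, 0, 1, 1, 0, 1]
model_scores = {
    "model_a": [0.1, 0.4, 0.8, 0.7, 0.2, 0.9],
    "model_b": [0.5, 0.5, 0.5, 0.5, 0.5, 0.5],
    "model_c": [0.1, 0.8, 0.7, 0.6, 0.2, 0.9],
}
champion, aucs = select_champion(validation_labels, model_scores)
```

The separate test dataset mentioned on the slide would then be scored once with the chosen champion, so that the reported performance is not biased by the selection step.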
• Completely black box models aren’t acceptable
• Partial Dependence plots iteratively rescore the model with a single variable set to each of its possible values
• Results show the influence of a single variable on the outcome
Partial Dependence plots
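The rescoring loop behind a partial dependence plot can be sketched in a few lines. The toy model and the feature names (`rain`, `hours`) are hypothetical stand-ins; any function that maps a row to a score would work in place of `toy_model`.

```python
def partial_dependence(predict, rows, feature, grid=None):
    """Average prediction over all rows with `feature` forced to each grid value."""
    if grid is None:
        grid = sorted({row[feature] for row in rows})
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = dict(row)       # copy so the original data is untouched
            modified[feature] = value  # set the single variable of interest
            total += predict(modified)
        curve.append((value, total / len(rows)))
    return curve

def toy_model(row):
    """Hypothetical stand-in for a trained model's scoring function."""
    return 0.01 * row["hours"] + 0.2 * row["rain"]

rows = [
    {"rain": 0, "hours": 8},
    {"rain": 1, "hours": 10},
    {"rain": 0, "hours": 12},
]
rain_curve = partial_dependence(toy_model, rows, "rain")
```

Plotting the resulting (value, average score) pairs gives the partial dependence curve for that variable.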
• Frontline leaders need more detailed explanations than Partial Dependence plots can provide
• LIME can explain the results of any predictive model
• LIME iteratively makes small changes to part of the input, rescores the model, and fits a linear model to the results
Local Interpretable Model-Agnostic Explanations (LIME)
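The perturb, rescore and fit loop can be illustrated with a simplified sketch for numeric inputs. This is not the API of the `lime` package or of any SAS procedure: the Gaussian perturbations, the exponential proximity kernel and the `black_box` model are all assumptions made for the example.

```python
import math
import random

def solve_linear_system(matrix, vector):
    """Solve matrix @ x = vector by Gaussian elimination with partial pivoting."""
    n = len(vector)
    augmented = [row[:] + [vector[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(augmented[r][col]))
        augmented[col], augmented[pivot] = augmented[pivot], augmented[col]
        for r in range(col + 1, n):
            factor = augmented[r][col] / augmented[col][col]
            for c in range(col, n + 1):
                augmented[r][c] -= factor * augmented[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(augmented[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (augmented[r][n] - known) / augmented[r][r]
    return solution

def explain_locally(predict, instance, n_samples=500, scale=0.5, kernel_width=1.0, seed=0):
    """Return [intercept, coef_1, ...] of a proximity-weighted linear fit around `instance`."""
    rng = random.Random(seed)
    size = len(instance) + 1
    xtwx = [[0.0] * size for _ in range(size)]
    xtwy = [0.0] * size
    for _ in range(n_samples):
        # Make a small random change to the input and rescore the model.
        offsets = [rng.gauss(0.0, scale) for _ in instance]
        target = predict([x + d for x, d in zip(instance, offsets)])
        # Samples closer to the original instance get a larger weight.
        weight = math.exp(-sum(d * d for d in offsets) / kernel_width ** 2)
        design = [1.0] + offsets  # regress on offsets, so the intercept is the local score
        for i in range(size):
            xtwy[i] += weight * design[i] * target
            for j in range(size):
                xtwx[i][j] += weight * design[i] * design[j]
    # Weighted least squares via the normal equations.
    return solve_linear_system(xtwx, xtwy)

def black_box(x):
    """Hypothetical non-linear model to be explained."""
    return 1.0 / (1.0 + math.exp(-(x[0] - 0.5 * x[1])))

explanation = explain_locally(black_box, [1.0, 2.0])
```

The fitted coefficients describe how each input pushes the score up or down near this one instance, which is the kind of per-prediction explanation a frontline leader could act on.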
Head office
363 Wellington Street
Perth, WA 6000
westernpower.com.au
Copyright © SAS Institute Inc. All rights reserved.
New Capabilities. New Opportunities
New SAS and what it means for you?
Jonathan Butow
Advanced Analytics Innovation
• Thank you
New Dynamic Market Place
1976 to 2018
[Timeline slides: 1997, 2002, 2004, 2005, 2010, 2014, 2016]
Insight Discovery & Model Build
Model Governance & Deployment
Cloud Native Deployment & Architecture
Support for Modernisation
Support for modern ML/DL techniques
• Decision Trees
• Generalized Linear Models
• K-means, K-modes, K-Prototype Clustering
• Linear Regression
• Logistic Regression
• Nonlinear Regression
• Generalized Additive Model
• Non-parametric Logistic Regression
• Ordinary Least Squares Regression
• Partial Least Squares Regression
• Principal Component Analysis
• Quantile Regression
• Factorization Machines
• Gradient Boosting
• Random Forest
• Support Vector Machines
• Convolutional Neural Networks
• Recurrent Neural Networks
Model Interpretability
• Partial Dependence (PD)
• Individual Conditional Expectation (ICE)
• Local Interpretable Model-agnostic Explanations (LIME)
Model Deployment and Publishing
Building models is academic. Deploying them is economics.
Gartner 2018
Automated Model Deployment Options
Batch scoring: Base SAS, In-Database, CAS API
Real/near-real time scoring: Event Stream Processing, CAS API, MAS/REST API
SAS® Cloud Descriptions
Managed Services
Your software or infrastructure can be hosted or remotely managed by SAS experts 24/7.
Results-as-a-Service
Give us your data and problem, and we give you the answers on which you can take action.
Software-as-a-Service
Off-the-shelf offerings designed to scale and fit for purpose. Sign up, log in, and get to work. Can be modified to your future needs.
General Support for Cloud Providers
• General policy on virtualized environments
• Connectivity to cloud native data repositories (S3, SQL Server on Azure)
Cloud Deployment Patterns
• SAS Analytics for Containers
• SAS Analytics for Containers on SAS Viya*
• SAS support of Docker Containers and Kubernetes Orchestration
• SAS support of Cloud Foundry
Cloud-Specific Offerings
• Amazon: AWS Quick Start*
• Microsoft Azure
• Oracle Cloud
• Google Cloud Platform
• Managed Analytic Service Providers…
SAS® on Cloud Providers: Example Offerings
New Opportunities
• Modernise your current SAS implementation to appeal to a broader audience
• Tackle new analytics requirements using SAS deep learning and gradient boosting algorithms
• Get to production and deploy your model more quickly by leveraging automated REST API model deployment
• Reduce overall project complexity and time to market by leveraging SAS cloud solutions and support
Wrap Up
Best Presentation Award Winner Q2 2018
Dean Hini, SAS Melbourne User Group (SMUG)
Wrap Up
• Thank you
  • Presenters
  • Committee
  • Audience
• Survey – Please complete & hand back for Lucky Draw
• Another Survey coming your way!
  • Mid December till end January 2019
  • Revamping user groups – we need your input
  • Win one of 4 Google Home Hubs
• Presentations available on User Group website
• Lucky Draw
Lucky Draw
Please join us for snacks and drinks