Analyzing billions of data rows with Alteryx, Amazon Redshift, and Tableau
TRANSCRIPT
Analyzing billions of rows with Amazon Redshift, Alteryx and Tableau
May 3, 2016
Today’s presenters
Adrian Loong, National Manager, RXP Services, Analytics Consultant & Alteryx ACE
Brandon Chavis, Solutions Architect, Amazon Web Services
Raman Kaler, Alliance Marketing Manager, Alteryx
Download a FREE Trial: www.alteryx.com/alteryxfortableau
© 2016 Alteryx, Inc. | Confidential
Alteryx - the leading platform for self-service data analytics
Prep, blend, and analyze all your data
using a repeatable workflow
Deliver deeper insights
in hours, not weeks
Deploy and share analytics at scale
The leading platform for self-service data analytics
Input: All Relevant Data
Prep & Blend, Enrich, Analyze
Output: All Popular Formats
Share
Download a FREE Trial: alteryx.com/trial
Three Data Integration Markets
Data Prep & Blending: Emerging Market (>$1B); Fast Moving (>50% growth); Business User Focused
Data Science: Tiny (<$100M); Middling Growth; Niche, Open Source
Operational ETL: Large Market (>$5B); Slow Moving (<5% growth); IT Driven
What is Amazon Redshift?
• Relational data warehouse
• Massively parallel; petabyte scale
• Fully managed
• HDD and SSD platforms
• $1,000/TB/year; starts at $0.25/hour
Amazon Redshift: a lot faster, a lot simpler, a lot cheaper
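To put the two price points quoted above side by side, here is a rough back-of-the-envelope sketch in Python. It assumes a node billed at $0.25/hour runs around the clock for a 365-day year, and it treats $1,000/TB/year as a flat per-terabyte rate. The function names and the 10 TB example are illustrative only; actual pricing depends on node type and commitment.

```python
HOURS_PER_YEAR = 24 * 365  # assumption: always-on node, non-leap year


def yearly_cost_from_hourly(rate_per_hour: float) -> float:
    """Yearly cost of one node billed hourly and never switched off."""
    return rate_per_hour * HOURS_PER_YEAR


def yearly_cost_from_storage(terabytes: float, rate_per_tb_year: float = 1000.0) -> float:
    """Yearly cost at a flat per-terabyte rate (the slide's $1,000/TB/year)."""
    return terabytes * rate_per_tb_year


if __name__ == "__main__":
    print(yearly_cost_from_hourly(0.25))   # entry price quoted on the slide
    print(yearly_cost_from_storage(10))    # hypothetical 10 TB warehouse
```

The two figures measure different things (an entry-level hourly rate versus a per-terabyte yearly rate), so they are best read as a floor and a scaling rule rather than as the same price expressed twice.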
Amazon Redshift is easy to use
• Provision in minutes
• Monitor query performance
• Point and click resize
• Built in security
• Automatic backups
Amazon Redshift architecture
Leader Node
• Simple SQL end point
• Stores metadata
• Optimizes query plan
• Coordinates query execution
Compute Nodes
• Local columnar storage
• Parallel/distributed execution of all queries, loads, backups, restores, resizes
Start at just $0.25/hour, grow to 2 PB (compressed)
• DC1: SSD; scale from 160 GB to 326 TB
• DS2: HDD; scale from 2 TB to 2 PB
[Architecture diagram: SQL Clients/BI Tools connect over JDBC/ODBC to the Leader Node, which is linked over 10 GigE (HPC) to the Compute Nodes; each node is shown with 128GB RAM, 16TB disk and 16 cores. Ingestion, Backup and Restore go through Amazon S3/Amazon DynamoDB/SSH.]
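The leader/compute split described above can be pictured with a toy scatter-gather sketch in Python. This is only an illustration of the general pattern (a coordinator fans work out to workers that each aggregate their own slice, then combines the partial results); it is not how Redshift is implemented, and the function names, the round-robin distribution and the sample values are all assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def split_rows(rows, node_count):
    """Distribute rows round-robin, one slice per compute node."""
    return [rows[i::node_count] for i in range(node_count)]


def compute_node_sum(rows):
    """Each compute node aggregates only its local slice."""
    return sum(rows)


def leader_sum(rows, node_count=3):
    """Leader: fan the work out, then combine the partial results."""
    slices = split_rows(rows, node_count)
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        partials = list(pool.map(compute_node_sum, slices))
    return sum(partials)


if __name__ == "__main__":
    call_durations = list(range(1, 101))  # made-up sample values
    print(leader_sum(call_durations))
```

Because each worker only touches its own slice, adding workers shrinks the slice each one has to scan, which is the intuition behind the "parallel/distributed execution" bullet.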
The Amazon Redshift view of data warehousing
Enterprise
• 10x cheaper
• Easy to provision
• Higher DBA productivity
Big Data
• 10x faster
• No programming
• Easily leverage BI tools, Hadoop, Machine Learning, Streaming
SaaS
• Analysis in-line with process flows
• Pay as you go, grow as you need
• Managed availability & DR
The legacy view of data warehousing ...
• Global 2,000 companies
• Sell to central IT
• Multi-year commitment
• Multi-year deployments
• Multi-million dollar deals
… Leads to dark data
This is a narrow view
Small companies also have big data (mobile, social, gaming, adtech, IoT)
Long cycles, high costs, administrative complexity all stifle innovation
[Chart: Enterprise Data vs. Data in Warehouse, 1990 to 2020; vertical axis from 0 to 1200]
Please note:
All data used for this presentation is sample data only. It is used to demonstrate the capability of AWS,
Alteryx and Tableau, and no actual amaysim data will be shared.
1. Introduction
2. Business challenges
3. Best of breed BI stack - AWS Redshift, Alteryx & Tableau
4. Learnings
amaysim is an Australian mobile virtual network operator established in November 2010.
About me
Adrian Loong – Business Intelligence manager
• Responsible for analytics strategy and execution across Finance, Retail Sales, Marketing and HR
• Expertise:
• Decision support / Management consulting
• Visualisation, Financial analytics, Budgeting & Forecasting
• CPA
About amaysim
Strategy - Data driven decisions in real time
1. Knowledge is power
2. Smart management = happy customers & happy shareholders
3. Get a real competitive edge - stop looking back and start predicting the future
4. A 3 year rolling forecast in real time – generate real company wealth by being reliable
How amaysim has benefited from their analytics program
Workforce Productivity
• We have a 3 person analytics team covering a wide span of functions (Finance, Marketing – Customer retention & acquisition, Sales – Retail and Online, Data warehousing, HR)
• By enabling line of business users to quickly build on a baseline of analytics, they can solve their own specific business problems quickly and do not have to wait on Business Intelligence teams
Reduced time to insight
• We are able to get the data we need faster. Projects that would have taken 2-3 weeks are now down to a day
Data driven decision making
• Line of business users are able to get direct access to their own data in an easy visualization, enabling them to solve problems faster
• People can look at the data, do some discovery, and then arrive at an answer
amaysim has a wide variety of data sources and a lot of data being generated daily
Wide variety of data sources: Source Systems; Databases
A lot of data is generated daily: Phone call; SMS; Internet data traffic
Over 10 billion rows of data with nearly 20 million rows added daily
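The row counts quoted above translate into a growth rate with a little arithmetic. The Python sketch below takes the two figures at face value (10 billion rows in total, 20 million new rows per day) and assumes the daily rate stays constant, which real traffic will not; the names are illustrative.

```python
DAILY_NEW_ROWS = 20_000_000       # "nearly 20 million rows added daily"
CURRENT_ROWS = 10_000_000_000     # "over 10 billion rows"


def rows_added_per_year(daily_rows: int, days: int = 365) -> int:
    """Rows added in a year at a constant daily rate."""
    return daily_rows * days


def days_to_add(target_rows: int, daily_rows: int) -> float:
    """Days needed to accumulate target_rows at a constant daily rate."""
    return target_rows / daily_rows


if __name__ == "__main__":
    print(rows_added_per_year(DAILY_NEW_ROWS))        # yearly intake
    print(days_to_add(CURRENT_ROWS, DAILY_NEW_ROWS))  # days to add another 10 billion
```

At that pace the table roughly doubles in about 500 days, which is why query performance and scalability matter for the warehouse choice.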
Competition is intense in the mobile sector
Major carriers Mobile Virtual Network Operators
We only have a small team covering a wide span of functions
Retail Sales
Finance
Customer acquisition
Customer retention
HR
Photo Source: http://www.timeshighereducation.co.uk/news/business-schools-not-first-port-of-call-for-managerial-recruits/2013863.article
Traditional tools are unable to keep up with the speed and velocity of business demands; tools like Redshift, Tableau and Alteryx enable analysis within hours
Traditional BI: Keep calm and please wait VS Analysis at the speed of thought
We need tools that empower us to analyse at the speed of thought
Business intelligence stack
Source Data
• ECC
• CRM
• CSC
• Google Analytics
• Sales POS
• The list goes on…
IT ELT/ETL
• CDC used for real time data replication
• Slower to build
• More reliable
Data Warehouse
• Redshift selected for reliability, processing power and scalability
Business ETL
• Alteryx Desktop selected & deployed
• Gives our business the agility we need
• Business driven rather than IT driven
• Ability to do predictive analytics
Visualizer & Reporting
• Tableau Desktop & Server selected and deployed
• Self serve BI
• Business dashboarding
• Ability to do predictive analytics
Time to load; Reliability of Data; Query performance; Flexibility for slice/dice; Visualization performance; Data definitions / governance; Data exploration
Redshift provides the speed and robustness to store and analyze vast volumes of data. Alteryx fuels the Tableau visualisation allowing us to quickly gain insights in Tableau
[Diagram elements: Source system 1; Source system 2; Livechat; Zendesk; External data / Marketing data; Amazon Redshift Datamart; Blend & enrich (clean the data, apply business rules, validate business rules in teams); Visualization. Steps 1 to 4 in the diagram correspond to the numbered comments.]
Comments
1. Tableau can be used to visualise and analyse big data directly from Redshift
2. Alteryx can be used to blend data that isn't in the Redshift database
3. Alteryx can apply & validate more complex business rules before visualizing outputs in Tableau
4. Continuously iterate between Alteryx and Tableau when discovering more data
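The blend, apply-rules and validate steps described in the comments can be pictured in plain Python. Alteryx does this in a visual workflow, so the sketch below is only an analogy: the field names, the sample rows, the ticket threshold and the helper functions are all hypothetical, and no Alteryx or Redshift API is used.

```python
# Rows as they might come back from a Redshift query (hypothetical fields).
warehouse_rows = [
    {"customer_id": 1, "monthly_spend": 40.0},
    {"customer_id": 2, "monthly_spend": 15.0},
    {"customer_id": 3, "monthly_spend": 25.0},
]

# Rows from a source outside the warehouse, e.g. a support-ticket export.
external_rows = [
    {"customer_id": 1, "open_tickets": 0},
    {"customer_id": 2, "open_tickets": 3},
]


def blend(warehouse, external):
    """Left-join external fields onto warehouse rows by customer_id."""
    lookup = {row["customer_id"]: row for row in external}
    blended = []
    for row in warehouse:
        extra = lookup.get(row["customer_id"], {})
        blended.append({**row, "open_tickets": extra.get("open_tickets", 0)})
    return blended


def apply_rule(rows, max_tickets=2):
    """Hypothetical business rule: flag customers with many open tickets."""
    return [{**row, "at_risk": row["open_tickets"] > max_tickets} for row in rows]


def validate(rows, expected_count):
    """Basic check: the blend must not drop or duplicate customers."""
    ids = [row["customer_id"] for row in rows]
    return len(ids) == expected_count and len(set(ids)) == len(ids)


if __name__ == "__main__":
    result = apply_rule(blend(warehouse_rows, external_rows))
    print(result)
    print(validate(result, expected_count=len(warehouse_rows)))
```

A left join is used here so that customers missing from the external source are kept rather than silently dropped; the validation step then checks that the row count still matches the warehouse.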
Alteryx enables us to combine different data sources in a visual workflow before visualisation in Tableau
[Diagram labels: Source systems; Databases; Server; Servers; Workstations; Tableau server; Tableau dashboards]
Tableau is excellent for collaboration
Democratize the opportunity
Source: http://www.heynataliejean.com/2009/01/end-of-nerd-era.html
Most business leaders want to use data to make better decisions.
• You don’t have to be an IT specialist to use Alteryx
• Give users access to the tools, use Alteryx & Tableau to show them how much faster you can spot a trend
• 1:1 Training sessions for different stakeholders tailored to their needs
Make it relevant to different stakeholders
Senior management
• More interested in insights, dashboards and outcomes
• Spend time showing them the dashboards built and the opportunities to improve business performance (e.g. revenue generation, expense reduction)
Functional specialists & leaders
• Alteryx relevant to functional leaders
• Alteryx to “clean” the data before visualizing in Tableau
• How can it be used to streamline processes
Celebrate success: Keep building and iterating
Where we started
• Existing tools require analysts to have “coding” experience
• Few disparate analysts using SQL to deliver analytical solutions
Where we are now
Using ‘Best’ of Breed products
• Visualization: Tableau
• Data blending: Alteryx
• Warehousing: Redshift (WIP)
Using Alteryx & Tableau to deliver:
• Revenue assurance
• Sim Sales
• Customer disconnections & churn
• Port-outs by carrier
• CDR analysis
Where we will be
Self serve analytics for all business users
• Real time P&L with slice and dice with visualization
• Assurance over logic processing (Audit, Commissions)
Advanced predictive analytics
• Churn propensity
• Customer behavior
1. Democratize the opportunity
2. Make it relevant to different stakeholders
3. Cultural change takes time
Next steps: Get the Alteryx Starter Kit: www.alteryx.com/alteryxfortableau