DataCanvas: Big Data Analytic Flow in Cloud
![Page 2: DataCanvas: Big Data Analytic Flow in Cloud](https://reader035.vdocuments.site/reader035/viewer/2022081401/558eacf01a28ab98708b46af/html5/thumbnails/2.jpg)
• 16.9B USD in 2015
• 40% Big data project
• Hadoop: CAGR 58%, 2.2B in 2020
![Page 3: DataCanvas: Big Data Analytic Flow in Cloud](https://reader035.vdocuments.site/reader035/viewer/2022081401/558eacf01a28ab98708b46af/html5/thumbnails/3.jpg)
• Volume
• Velocity
• Variety
Super hot in
• Government
• Communication
• Media
• Banking
• Manufacturing
![Page 4: DataCanvas: Big Data Analytic Flow in Cloud](https://reader035.vdocuments.site/reader035/viewer/2022081401/558eacf01a28ab98708b46af/html5/thumbnails/4.jpg)
Technology
• Infrastructure: IAAS, SAAS, DAAS
• Application: BI, social analytics, visualization…
• Domain solution: Finance, Retail, Insurance
• Development: Data scientist, DevOps
Business process
Operation, Support
![Page 5: DataCanvas: Big Data Analytic Flow in Cloud](https://reader035.vdocuments.site/reader035/viewer/2022081401/558eacf01a28ab98708b46af/html5/thumbnails/5.jpg)
ANALYTICS IS THE
Make data live
• Data sitting in storage generates no value
Revenue and profit from data
• Application and solution to get insights from data
• Link insights with business
• Don’t stop at visualization or report
Advanced analytics is the engine of business solution
• Fraud detection
• Customer retention
![Page 6: DataCanvas: Big Data Analytic Flow in Cloud](https://reader035.vdocuments.site/reader035/viewer/2022081401/558eacf01a28ab98708b46af/html5/thumbnails/6.jpg)
COMMON ANALYTICS SCENARIOS
Data analysis
• Example: Estimate a customer’s life cycle value
• User: data scientist
• Demands: flexibility to explore and faster iteration
Product analysis
• Example: How many female customers visit the website home page and leave within fewer than 5 clicks?
• User: product manager, data analyst, marketing team
• Demands: no complex coding, SQL query at most
Predictive service
• Example: Is this transaction a fraud?
• User: developer and data scientist
• Demands: pipeline processing
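The product analysis example on this slide (female customers who land on the home page and leave in fewer than 5 clicks) fits in a single SQL query. The following is a minimal sketch only, using Python's built-in `sqlite3`. The `sessions` table, its columns, the `'F'` gender code and the `'/'` home-page path are all hypothetical; the slides do not describe any schema.

```python
import sqlite3


def count_short_home_visits(rows, max_clicks=5):
    """Count distinct female customers whose session started on the home
    page and ended in fewer than `max_clicks` clicks.

    `rows` holds (customer_id, gender, landing_page, clicks) tuples.
    The schema is an assumption made for illustration.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sessions ("
        "customer_id INTEGER, gender TEXT, landing_page TEXT, clicks INTEGER)"
    )
    conn.executemany("INSERT INTO sessions VALUES (?, ?, ?, ?)", rows)
    (count,) = conn.execute(
        "SELECT COUNT(DISTINCT customer_id) FROM sessions "
        "WHERE gender = 'F' AND landing_page = '/' AND clicks < ?",
        (max_clicks,),
    ).fetchone()
    conn.close()
    return count


# Made-up sample sessions, not real data.
sample_sessions = [
    (1, "F", "/", 3),
    (2, "F", "/", 8),
    (3, "M", "/", 2),
    (4, "F", "/pricing", 1),
    (1, "F", "/", 4),
]
print(count_short_home_visits(sample_sessions))
```

Counting distinct customers rather than sessions is a design choice here: the slide asks "how many customers", so a customer with several short visits is counted once.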
![Page 7: DataCanvas: Big Data Analytic Flow in Cloud](https://reader035.vdocuments.site/reader035/viewer/2022081401/558eacf01a28ab98708b46af/html5/thumbnails/7.jpg)
WHAT DOES DATACANVAS ADDRESS
Powering all these scenarios
• Data Analysis: Flexible
• Product Analysis: Intuitive
• Prediction service: Complex processing
Enable application, solution and business process
DataCanvas
![Page 8: DataCanvas: Big Data Analytic Flow in Cloud](https://reader035.vdocuments.site/reader035/viewer/2022081401/558eacf01a28ab98708b46af/html5/thumbnails/8.jpg)
Application: Recommendation, Anomaly Detection, Operation Analytics
Platform to enable application and connect infrastructure: Service, Pipeline
Infrastructure: Hadoop (Hive/Pig), RDBMS, NoSQL, Spark
![Page 9: DataCanvas: Big Data Analytic Flow in Cloud](https://reader035.vdocuments.site/reader035/viewer/2022081401/558eacf01a28ab98708b46af/html5/thumbnails/9.jpg)
• Big data challenges span services, environments and even locations
Data Generation → Storage → Processing → Reporting
• An orchestration platform is required to manage and connect steps in the pipeline
• Bring Pipeline to the game
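One way to picture "manage and connect steps in the pipeline" is a dependency graph that an orchestrator walks in order. The sketch below is only illustrative and is not DataCanvas code. It wires up the four stages named on this slide (data generation, storage, processing, reporting) with Python's standard `graphlib`. The step functions, the `run_pipeline` helper and the dictionary layout are assumptions.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_pipeline(steps, depends_on):
    """Run every step after the steps it depends on.

    steps:      name -> callable taking the outputs of its upstream steps
    depends_on: name -> list of upstream step names
    Returns the execution order and each step's output.
    """
    order = list(TopologicalSorter(depends_on).static_order())
    outputs = {}
    for name in order:
        upstream = [outputs[dep] for dep in depends_on.get(name, [])]
        outputs[name] = steps[name](*upstream)
    return order, outputs


# Hypothetical stand-ins for real jobs (Hive query, Spark job, report, ...).
STEPS = {
    "data_generation": lambda: [3, 1, 2],
    "storage": lambda records: sorted(records),
    "processing": lambda stored: sum(stored),
    "reporting": lambda total: f"total={total}",
}
DEPENDS_ON = {
    "data_generation": [],
    "storage": ["data_generation"],
    "processing": ["storage"],
    "reporting": ["processing"],
}

order, outputs = run_pipeline(STEPS, DEPENDS_ON)
print(" -> ".join(order))
print(outputs["reporting"])
```

Keeping the dependency map separate from the step code means each step can, in principle, run on a different service or in a different location, while the orchestrator only needs the graph.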
![Page 10: DataCanvas: Big Data Analytic Flow in Cloud](https://reader035.vdocuments.site/reader035/viewer/2022081401/558eacf01a28ab98708b46af/html5/thumbnails/10.jpg)
No more central data store, bring computation to data, not vice versa!
• Unify resource
• Optimize workload
• Automation
![Page 11: DataCanvas: Big Data Analytic Flow in Cloud](https://reader035.vdocuments.site/reader035/viewer/2022081401/558eacf01a28ab98708b46af/html5/thumbnails/11.jpg)
Pain points
• Unmanageable
• Redundancy
• Hard to iterate fast
• Gap between documentation and actual workflow
• Monster configuration
• Spaghetti script
• No reuse
• No idea what’s actually running
![Page 12: DataCanvas: Big Data Analytic Flow in Cloud](https://reader035.vdocuments.site/reader035/viewer/2022081401/558eacf01a28ab98708b46af/html5/thumbnails/12.jpg)
WHAT IS DATACANVAS
• Drag & drop to run data flow
• Public or private cloud
• Intuitive job management
• Module repository
• Built-in library
• Make your own recipe
• Powering advanced analytics
• Business solution template
• Address common applications
• Fully customizable
• Team collaboration
• Flow sharing
• Module sharing
• This is the BEST documentation
![Page 13: DataCanvas: Big Data Analytic Flow in Cloud](https://reader035.vdocuments.site/reader035/viewer/2022081401/558eacf01a28ab98708b46af/html5/thumbnails/13.jpg)
VALUE
Workflow Scheduling (Operation)
• User experience
• Production quality
• Easy ops
Module (Developer/Data scientist)
• Data ETL
• Machine learning
• Module repository
Solution Template (Business)
• Business requirement
• Recommendation
• Fraud detection
• Sentiment analysis
![Page 14: DataCanvas: Big Data Analytic Flow in Cloud](https://reader035.vdocuments.site/reader035/viewer/2022081401/558eacf01a28ab98708b46af/html5/thumbnails/14.jpg)
WHY CONTAINER MATTERS
• Seamlessly connect to any existing/upcoming computation infrastructure
• Enabler for module management and sharing
• Support Lambda: Processing + Serving + Visualization
Lambda Architecture
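The slide does not say how DataCanvas supports the Lambda Architecture. As a general illustration of the pattern only, the sketch below builds a batch view over a full dataset and a speed view over recent events, then merges the two at query time in a serving function. All function names and the sample events are hypothetical.

```python
from collections import Counter


def build_batch_view(master_dataset):
    """Batch layer: recompute counts over the full historical dataset."""
    return Counter(master_dataset)


def build_speed_view(recent_events):
    """Speed layer: count only events not yet absorbed by the batch view."""
    return Counter(recent_events)


def serve(batch_view, speed_view, key):
    """Serving layer: answer a query by merging both views."""
    return batch_view[key] + speed_view[key]


# Made-up events for illustration.
master_dataset = ["login", "purchase", "login"]
recent_events = ["login", "refund"]

batch = build_batch_view(master_dataset)
speed = build_speed_view(recent_events)
print(serve(batch, speed, "login"))
```

A `Counter` returns 0 for missing keys, so a query for an event type seen in only one layer (or in neither) still works without extra checks.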
![Page 15: DataCanvas: Big Data Analytic Flow in Cloud](https://reader035.vdocuments.site/reader035/viewer/2022081401/558eacf01a28ab98708b46af/html5/thumbnails/15.jpg)
COMPETITORS
Products compared: AWS DP, Oozie, AzureML, MortarData, Azkaban, DataCanvas
Comparison criteria:
• Workflow + Scheduling
• Module management
• Solution template
• Multiple Env support
• Collaboration + Sharing
• Cloud service
Rating legend: Good / Not that great / Bad or not supported
DataCanvas = ((Workflow + Scheduler) * Drag & drop * Module composition ) ^ Solution @ Cloud
![Page 16: DataCanvas: Big Data Analytic Flow in Cloud](https://reader035.vdocuments.site/reader035/viewer/2022081401/558eacf01a28ab98708b46af/html5/thumbnails/16.jpg)
BUSINESS MODEL
Subscription
Charge services on tiers: Startup, Premium, Enterprise
Free
• 1 user
• Unlimited projects
• Limited workload, good for evaluation
• Forum support
Startup
• Unlimited users
• Unlimited projects
• Decent workload, 3-5 jobs in parallel
• Email support
Premium
• Unlimited users
• Unlimited projects
• Significant workload, >20 jobs in parallel
• Email support
Enterprise
• Unlimited users
• Unlimited projects
• Workload at scale
• Full support
Annual Support Package
• For Premium and Enterprise customers
• Forum support, Email support with SLA, Telephone support
![Page 17: DataCanvas: Big Data Analytic Flow in Cloud](https://reader035.vdocuments.site/reader035/viewer/2022081401/558eacf01a28ab98708b46af/html5/thumbnails/17.jpg)
TARGET CUSTOMER
Data scientist
• Assembly line to facilitate exploration
• Team collaboration
Analyst
• Drag and drop to find insights, need any more reason?
Manager
• Faster iteration
• Shorter time to deliver project
• Easier to maintain
![Page 19: DataCanvas: Big Data Analytic Flow in Cloud](https://reader035.vdocuments.site/reader035/viewer/2022081401/558eacf01a28ab98708b46af/html5/thumbnails/19.jpg)
THANK YOU