Logitech journey to the Cloud - next generation data warehousing

BIG DATA TODAY: Journey to the Cloud - next generation data warehousing. Steven Perelli-Minetti, Manager – Data Architecture, Logitech; Avi Deshpande, Principal – Big Data, Logitech

Upload: avinash-deshpande

Post on 15-Apr-2017

Category:

Data & Analytics


TRANSCRIPT

Page 1: Logitech journey to the Cloud - next generation data warehousing

BIG DATA TODAY

Journey to the Cloud - next generation data warehousing

Steven Perelli-Minetti, Manager – Data Architecture, Logitech

Avi Deshpande, Principal – Big Data, Logitech

Page 2: Logitech journey to the Cloud - next generation data warehousing

© 2015, Pentaho. All rights reserved. pentaho.com. Worldwide +1 (866) 660-7555

Journey to Cloud

Cloud empowers IT organizations to redefine the way data services are produced and delivered for analytics.

More scalable

‒ can reconfigure a larger cluster in an hour

More efficient

‒ can turn off over the weekend

‒ can clone prod for UAT and drop when done

More reliable

‒ AWS automatically does 90% of what our DBAs did

Page 3: Logitech journey to the Cloud - next generation data warehousing

Challenges of our traditional warehouse

Our on-prem data warehouse could no longer be extended to effectively address our evolving business needs:

Growing too fast for Exadata

‒ smallest increase in any resource is a quarter rack

Difficult to set up and tune performance

Difficult to manage usage

‒ Resource usage over time

‒ Queries … impact of each team, process

Page 4: Logitech journey to the Cloud - next generation data warehousing

Journey to the Cloud – Data Warehouse Architecture

[Architecture diagram. Components shown: ERP, POS, Scrapy, SFDC, MDM, DRM, Web Services, Data Interfaces, Pentaho DI, AWS S3, AWS Glacier, AWS Redshift, AWS EC2, RDS MySQL, Denodo, Pentaho BA, Tableau, Pentaho Operations Mart, GitHub, CloudWatch, SNS, IAM, CloudTrail]

Page 5: Logitech journey to the Cloud - next generation data warehousing

Producer and Consumer processes

Working in the cloud requires a different architecture to optimize use of cloud resources and services

Producers extract data and load to S3 by batch

‒ Amazon Simple Storage Service (Amazon S3) provides developers and IT teams with secure, durable, highly-scalable object storage

Consumers take a batch from S3 to load to Redshift

Asynchronous processes provide simple restart point

‒ If Redshift is down, we continue to run producers to load S3 batches, and restart consumers when Redshift is back up
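
The producer/consumer split on this slide can be sketched in a few lines of Python. This is only an illustration of the restart behaviour, not the Pentaho jobs used in the talk: `BatchStore` and `Warehouse` are hypothetical in-memory stand-ins for S3 and Redshift, and the key layout `source/batch-NNNNNN` is an assumption made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class BatchStore:
    """In-memory stand-in for an S3 bucket (hypothetical, not the AWS API)."""
    objects: dict = field(default_factory=dict)

    def put(self, key, rows):
        self.objects[key] = list(rows)

    def keys(self):
        # Zero-padded batch ids make lexical order equal to arrival order.
        return sorted(self.objects)


@dataclass
class Warehouse:
    """In-memory stand-in for Redshift; `up` simulates availability."""
    up: bool = True
    rows: list = field(default_factory=list)
    loaded_keys: set = field(default_factory=set)

    def load(self, key, rows):
        if not self.up:
            raise ConnectionError("warehouse unavailable")
        self.rows.extend(rows)
        self.loaded_keys.add(key)


def produce(store, source_name, batch_id, rows):
    """Producer: write one extracted batch to the store and return its key."""
    key = f"{source_name}/batch-{batch_id:06d}"
    store.put(key, rows)
    return key


def consume(store, warehouse):
    """Consumer: load every batch not yet loaded, oldest first.

    Stops at the first failure, so the next run restarts from that batch.
    Returns the keys loaded during this run.
    """
    done = []
    for key in store.keys():
        if key in warehouse.loaded_keys:
            continue
        try:
            warehouse.load(key, store.objects[key])
        except ConnectionError:
            break
        done.append(key)
    return done
```

Because the producer never talks to the warehouse, it keeps writing batches during an outage; rerunning `consume` once the warehouse is back picks up every pending batch in order.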

Page 6: Logitech journey to the Cloud - next generation data warehousing

Data Producer Process (Source to S3)

Page 7: Logitech journey to the Cloud - next generation data warehousing

Data Consumer Process (S3 to Redshift)

Page 8: Logitech journey to the Cloud - next generation data warehousing

Template driven development

Templates provide consistent processes

Simplifies maintenance, enforcement of standards

Makes it easier to develop specs for offshore development

Supports faster development, testing, debugging
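
As a rough illustration of the template idea (not the PDI templates shown in the talk), one shared template can be filled in per source table so that every producer job has the same shape. The spec fields, the `updated_at` column, and the bucket name below are all hypothetical; only Python's standard `string.Template` is used.

```python
from string import Template

# Hypothetical producer template; the fields are illustrative and do not
# reflect the PDI job or transformation file format.
PRODUCER_TEMPLATE = Template(
    "job=producer_${source}_${table}\n"
    "extract=SELECT * FROM ${table} WHERE updated_at > :last_run\n"
    "target=s3://${bucket}/${source}/${table}/\n"
)


def render_producer(source, table, bucket):
    """Fill the shared template for one source table.

    substitute() raises KeyError if a placeholder is left unfilled, which
    helps enforce that every job supplies the same set of parameters.
    """
    return PRODUCER_TEMPLATE.substitute(source=source, table=table, bucket=bucket)


def render_all(source, tables, bucket):
    """Render one producer spec per table from the same template."""
    return {table: render_producer(source, table, bucket) for table in tables}
```

A call such as `render_all("erp", ["orders", "invoices"], "example-bucket")` yields two specs that differ only in their parameters, which is what makes a change to the template propagate consistently to every job.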

Page 9: Logitech journey to the Cloud - next generation data warehousing

Pentaho PDI Template (Producer Transformation)

Page 10: Logitech journey to the Cloud - next generation data warehousing

Pentaho PDI Template (Producer Job)

Page 11: Logitech journey to the Cloud - next generation data warehousing

Demonstration

template based development

Page 12: Logitech journey to the Cloud - next generation data warehousing

Ops Data Mart supports better management

Collects data from our DI and BA servers

‒ Collects process metadata over time

‒ Out of the box reports / dashboards

Provides metadata supporting validations

‒ Raise a flag if today’s run is outside 2 standard deviations of the average

Provides history to see changes / trends over time

‒ Raise a flag if job run time doubles in a month

‒ Raise a flag if report usage drops over time
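
Two of the checks listed above can be sketched with Python's standard `statistics` module. This is a minimal example under stated assumptions: the slide does not say which metric is compared (row counts, durations) or how "doubles in a month" is measured, so the function names, the use of the sample standard deviation, and the first-versus-last comparison are choices made here for illustration.

```python
from statistics import mean, stdev


def outside_two_sigma(history, today):
    """True if today's value is more than 2 standard deviations from the
    mean of the historical values (sample standard deviation)."""
    if len(history) < 2:
        return False  # not enough history to estimate spread
    mu = mean(history)
    sigma = stdev(history)
    return abs(today - mu) > 2 * sigma


def runtime_doubled(runtimes_in_window):
    """True if the latest run time is at least twice the earliest one in
    the window (for example, one month of daily run times)."""
    if len(runtimes_in_window) < 2:
        return False
    first, last = runtimes_in_window[0], runtimes_in_window[-1]
    return first > 0 and last >= 2 * first
```

For instance, with a history of `[100, 102, 98, 101, 99]` a value of 104 is flagged and 103 is not; with run times `[10, 12, 15, 21]` the doubling check fires.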

Page 13: Logitech journey to the Cloud - next generation data warehousing

Data Virtualization

Business Layer

‒ Keep the business from trolling through the backend db

‒ Data consistency through single object, multiple consumers

Security through Data Virtualization rather than in every tool

‒ Hard to keep security in sync across multiple analytic tools

Rapid Prototyping

‒ Add new data source in DV layer first

‒ Move to Redshift / Pentaho after virtual analytics are validated

Page 14: Logitech journey to the Cloud - next generation data warehousing

Benefits…

Proactive – IT has embraced cloud as a model for achieving innovation through increased efficiency, reliability and agility

Reusability and template development

Rapid innovation within governance structure, balanced costs, risks and service levels

Greater efficiency and reliability, enabling broader audience to consume IT services via self-service

Page 15: Logitech journey to the Cloud - next generation data warehousing

Summary

What We Covered Today:

Pros / Cons of traditional solution architecture

Benefits of moving to cloud

Benefits of template development

Benefits of Ops Data Mart

Benefits of Data Virtualization

Page 16: Logitech journey to the Cloud - next generation data warehousing

Next Steps

Want to learn more?

Amazon AWS - https://aws.amazon.com/

Data Virtualization - https://en.wikipedia.org/wiki/Data_virtualization

‒ http://www.denodo.com/en

Columnar databases - https://aws.amazon.com/redshift/

Pentaho DI and BA - http://community.pentaho.com/

Page 17: Logitech journey to the Cloud - next generation data warehousing

Thank You

Join the Conversation

#PWorld15

Avi - @Avinash49799752

Steve - @StevePerelliMin