crimson 3 - final case presentation

The Future of Cummins Data Warehousing Architecture and Strategy. Pragnya Balamurukesan, Graham Cenko, Michael Khamis, Pavithra Thevasenapathy. Crimson 3

Upload: pragnya-balamurukesan

Post on 15-Apr-2017


1

The Future of Cummins Data Warehousing Architecture and Strategy

Pragnya Balamurukesan, Graham Cenko, Michael Khamis, Pavithra Thevasenapathy

Crimson 3

2

Agenda


Our Understanding

Data Warehousing Trends

Recommendations

Risks and Mitigations

Financials

Implementation Timeline

Conclusion

3

Our Understanding

Cummins has six Data Warehouses on the Oracle Exadata platform, a Data Lake environment in Hadoop, and a Teradata warehousing appliance, none of which are integrated

The current Data Warehouse architecture and strategy do not meet the company’s business intelligence needs or its future needs

What Data Warehouse architecture and strategy would meet Cummins’ needs and support future growth initiatives?

4

Future trends that should be incorporated into Cummins’ Data Warehousing strategy

Cloud: Data Warehouse, Business Intelligence Tools

Big Data: Big Data Analytics, Hadoop Platform

Real-Time: Data Streaming, Analytics & Reporting

Consolidation: Physical, Logical

Foley, John. “The Top 10 Trends in Data Warehousing.” Forbes. 10 March 2014
Satell, Matt. “The Future of Data Warehousing: 7 Industry Experts Share Their Predictions.” BetterBuys. 5 November 2014

5

Cummins should adopt this Data Warehouse architecture to satisfy future trends and growth initiatives

Data Sources: Cloud, Files, Office files, Web services, Social feeds, Sensor, Web logs

Enterprise Information Management: BPM, ECM, CEM, Discovery, Info exchange

Data Warehouse, Hadoop, Stream Computing

Master Data Management

Data Virtualization

Business Intelligence Tools: Reporting, Statistical analysis, Visualization

6

Cummins should take these five actions to achieve the recommended Data Warehouse architecture

1. Move certain databases from Oracle Data Warehouse to Teradata Active Data Warehouse Private Cloud

2. Implement Hadoop-as-a-Service using Google Compute Engine and MapR

3. Adopt Cisco Composite Data Virtualization Platform

4. Add IBM InfoSphere Streams, Tableau and Spotfire to the Business Intelligence & Analytics tools

5. Governance

7

Cummins should move certain Databases from Oracle Exadata to Teradata Active Data Warehouse Private Cloud

ORACLE: Corporate, Components, Engine, Power Gen, Distribution

TERADATA ADW PRIVATE CLOUD (EDW): Components, Power Gen, Engine, Distribution

Supply chain, Logistics, Sales, Marketing, Inventory & Operational data

BENEFITS

Active events: Customer-sales representative interaction, worker in shipping & receiving

Active load: Arrival of damaged critical supplies

Active enterprise integration: Fitting into existing portals, Web services, SOA components

Active workload management: Controlling mixed workloads

Active availability: Increasing the DW availability from business critical to mission critical

Active access: Out-of-stock situation, inventory manager makes decisions

Teradata (2015) “Enabling the Agile Enterprise with Active Data Warehousing”

8

Cummins should adopt Teradata private cloud for the following reasons

Why private cloud model?

Challenges in Public Cloud

Worldwide private cloud adoption (Forbes)

Benefits of Teradata ADW private cloud

Consolidate to Teradata private ADW

Reduced costs through server utilization

Pay for what you use, when you need it

Faster: less than five minutes

Elastic performance

Quick decision making

Characteristics of Teradata ADW private cloud

• High Active Performance

• Effortless Scalability

• Operational Availability

• Enterprise Concurrency

• Investment Protection

Success stories

Leading healthcare company saves 4.3 billion, delivering 250,000 self-service reports and improving performance by 10x

Government agency whose queries took 20 hours to run can now run them in 15 minutes

Teradata News Release (2012) “Teradata Active Data Warehouses Provide Private Cloud Benefits”

9

Cummins should implement Hadoop-as-a-Service using Google Compute Engine and MapR

Google Cloud Storage

<cluster> [Master]: MapR CLDB (Container Location Database), MapR FileServer

<cluster> 000 [Worker]: MapR FileServer

<cluster> 001 [Worker]: MapR FileServer

<cluster> nnn [Worker]: MapR FileServer

DATA FLOW

1 An application downloads a data file from Google Cloud Storage and pushes it to MapR-FS

2 The CLDB distributes the file to MapR-FS based on the query

3 The result of the query is written to the file on Google Cloud Storage

FEATURES

1 Operational Intelligence

2 Enterprise Data Hub

3 Internet of Things

4 Security and Risk Management

5 Marketing Optimization

MapR (2014) “MapR, Hive, and Pig on Google Compute Engine”
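The three-step data flow on this slide can be sketched in plain Python. This is only an in-memory illustration: the dictionary standing in for Google Cloud Storage, the lists standing in for MapR FileServers, the function name `run_data_flow` and the word-count "query" are all assumptions made for the example, and no Google or MapR API is called.

```python
from collections import Counter


def run_data_flow(cloud_storage, object_name, result_name, num_workers=3):
    """Simulate the slide's three steps with in-memory stand-ins."""
    # Step 1: the application downloads the data file from cloud storage
    # and pushes it to the cluster file system.
    lines = cloud_storage[object_name].splitlines()

    # Step 2: the file is spread across worker file servers (round-robin
    # here; a real CLDB tracks container locations instead).
    file_servers = [lines[i::num_workers] for i in range(num_workers)]

    # Each worker runs the hypothetical query (a word count) on its share.
    partial_counts = [
        Counter(word for line in chunk for word in line.split())
        for chunk in file_servers
    ]
    total = sum(partial_counts, Counter())

    # Step 3: the query result is written back to a file on cloud storage.
    cloud_storage[result_name] = "\n".join(
        f"{word}\t{count}" for word, count in sorted(total.items())
    )
    return total


if __name__ == "__main__":
    storage = {"engines.txt": "engine alert\nengine ok\nalert"}
    run_data_flow(storage, "engines.txt", "result.tsv")
    print(storage["result.tsv"])
```

The point of the sketch is the shape of the flow (download, distribute, write back), not the distribution policy, which in a real cluster is decided by the platform.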

10

Cummins should implement Hadoop-as-a-Service using MapR for the following reasons

Why did we choose cloud deployment?

Cost, Scalability, Enhanced productivity, Collaboration, Elasticity, Efficiency

Why did we choose MapR?

Criteria | MapR | Cloudera | Hortonworks
Data Ingest | Batch and streaming writes | Batch | Batch
HBase Performance | Consistent low latency | Latency spikes | Latency spikes
High Availability | Self healing across multiple failures | Single failure recovery | Single failure recovery
Replication | Data + metadata | Data | Data
File I/O | Read/write | Append only | Append only
Write level authentication | Kerberos, Native | Kerberos | Kerberos

Robert D. Schneider (2014) “Hadoop Buyer’s Guide”, Ubuntu

11

Cummins should implement Composite Data Virtualization Platform to provide a unified logical view of all the data

Logical view of Cisco Composite

Traditional, Big data & cloud sources: Operational Stores, SaaS Applications, Data Warehouses and Marts

Data Virtualization Platform (Cisco Information Server): Abstract, Federate, Cache, Optimizer, Discovery

BI & Analytic tools

Unified logical enterprise view of all the data

Features

Instant access to all data

End-to-end data management

Faster response to BI & Analytics

David Besemer. Jan 2014. Cisco Data Virtualization

12

Cummins should install Composite Data Virtualization Platform for the following reasons

Criteria | Composite | Informatica | IBM
Federated query language | 3 | 2 | 2
Caching | 3 | 2 | 2
Profiling | 3 | 1 | 2
Metadata support | 3 | 1 | 1
Customer base | 3 | 2 | 2
Compatibility with existing technologies | 3 | 2 | 2
Total | 18/18 | 10/18 | 11/18

Benefits of Virtualization

Profit Growth

Risk Reduction

Technology Optimization

Staff Productivity

Time-to-Solution Acceleration

Cisco “Data Virtualization”
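The totals in the vendor matrix on this slide are simple column sums, which the short sketch below reproduces. The scores are copied from the slide; the reading of the scale as 1 to 3 (so that six criteria give a maximum of 18) and the names `SCORES`, `total_scores` and `max_possible` are assumptions made for the example.

```python
# Scores copied from the slide; the 1-3 scale is an assumption implied by "/18".
SCORES = {
    "Federated query language": {"Composite": 3, "Informatica": 2, "IBM": 2},
    "Caching": {"Composite": 3, "Informatica": 2, "IBM": 2},
    "Profiling": {"Composite": 3, "Informatica": 1, "IBM": 2},
    "Metadata support": {"Composite": 3, "Informatica": 1, "IBM": 1},
    "Customer base": {"Composite": 3, "Informatica": 2, "IBM": 2},
    "Compatibility with existing technologies": {
        "Composite": 3, "Informatica": 2, "IBM": 2,
    },
}


def total_scores(scores):
    """Sum each vendor's score over all criteria."""
    totals = {}
    for per_vendor in scores.values():
        for vendor, score in per_vendor.items():
            totals[vendor] = totals.get(vendor, 0) + score
    return totals


def max_possible(scores, top_score=3):
    """Highest total a vendor could reach with the top score everywhere."""
    return top_score * len(scores)


totals = total_scores(SCORES)
best = max(totals, key=totals.get)

if __name__ == "__main__":
    for vendor, total in totals.items():
        print(f"{vendor}: {total}/{max_possible(SCORES)}")
    print("Highest score:", best)
```

An unweighted sum treats every criterion as equally important; a weighted variant would multiply each score by a per-criterion weight before summing.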

13

Cummins should reevaluate its existing BI toolset and purchase Tableau and Spotfire for visualization and analytics

Existing - Reporting

• Action: Continue using OBIEE and MSBI for reporting. Phase out the other four traditional platforms

• Benefit: Reduced licensing and training costs, standardized reports and less complexity

Tableau - Visualization

• Action: Purchase Tableau Online for an easy-to-use data visualization platform that is designed for business end users

• Benefit: Enables self-service BI for the entire organization, with no support from IT needed

Tibco Spotfire - Statistical Analysis

• Action: Purchase Tibco Spotfire Platform for advanced analytical capabilities to be used by business analysts

• Benefit: Predictive and prescriptive analytical capabilities and the ability to consume structured and unstructured data

Tibco Software Company. “Tibco Spotfire Platform.” 15 December 2015
Tableau. “Tableau Online.” 15 December 2015

14

Cummins should adopt IBM InfoSphere Streams to enable real-time business intelligence

ACQUIRE: Real-time data from several different streams having different formats

ANALYZE: The data in real time using applications developed by either Cummins or IBM

ACT: On the Business Intelligence delivered in real time

Benefits of Stream Computing

Integrated Development Environment, Scale-Out Runtime, Analytic Toolkits

Avadhoot Patwardhan (2015) “Introduction: Real-Time Analytics on Data in Motion”
Aladdabigdata (2015) “Real-time Analytics using IBM InfoSphere Streams”
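The acquire, analyze, act pattern on this slide can be illustrated with plain Python generators. This is a sketch of the pattern only and does not use IBM InfoSphere Streams; the JSON and CSV input formats, the `nox` emission field, the window size and the function names are all hypothetical choices made for the example.

```python
import csv
import io
import json
from collections import deque


def acquire(json_lines, csv_text):
    """ACQUIRE: normalize readings that arrive in two different formats."""
    for line in json_lines:
        record = json.loads(line)
        yield {"engine": record["engine"], "nox": float(record["nox"])}
    for row in csv.DictReader(io.StringIO(csv_text)):
        yield {"engine": row["engine"], "nox": float(row["nox"])}


def analyze(readings, window=3):
    """ANALYZE: rolling average per engine over the last `window` readings."""
    history = {}
    for reading in readings:
        recent = history.setdefault(reading["engine"], deque(maxlen=window))
        recent.append(reading["nox"])
        yield reading["engine"], sum(recent) / len(recent)


def act(averages, limit):
    """ACT: produce an alert for every rolling average above the limit."""
    return [
        f"ALERT {engine}: rolling NOx {avg:.1f} exceeds {limit}"
        for engine, avg in averages
        if avg > limit
    ]


if __name__ == "__main__":
    json_feed = ['{"engine": "E1", "nox": 1.0}', '{"engine": "E1", "nox": 3.0}']
    csv_feed = "engine,nox\nE1,5.0\nE2,1.0\n"
    for alert in act(analyze(acquire(json_feed, csv_feed)), limit=2.5):
        print(alert)
```

Because each stage is a generator, readings flow through one at a time instead of being collected first, which is the property that distinguishes stream processing from batch reporting.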

15

Cummins should establish the following teams for effective governance over the Data Warehouse initiative

IT Steering Committee: Business & IT Leaders

Change Management

• Comprised of senior managers and supervisors of each business unit

• Communicates change to the company and each business unit

• Manages training of employees

Vendor Management

• Comprised of Cummins IT professionals

• Assigns tasks to vendors while monitoring the performance of each vendor

• Renegotiates contracts

Support Team

• Comprised of Cummins IT technicians for each business unit

• Groups will be assigned to each layer of the architecture

BICC Team

• Comprised of business managers from each business unit

• Champions BI technologies by defining standards, business alignment, project prioritization and management

Information Governance

• Comprised of a C-suite member, IT professionals, business managers, a paralegal, and members from each business unit

• Manages information throughout its lifecycle

16

It will take 3 years for Cummins to implement the recommended Data Warehouse strategy

Year 1, Year 2, Year 3

17

The project will cost Cummins $11,370,000 and result in the following benefits

Cost

Software: $1,400,000
Hardware: $675,000
Cloud Storage: $65,000
Tools: $5,750,000
End user Training: $200,000
Cost of Administration: $200,000
Maintenance Support: $2,680,000
External Contract: $400,000
Total Costs: $11,370,000

*See appendix for detailed cost description and more sources

Benefits

Business value is derived from the actions taken as a result of the analysis enabled by the BI tools

Cost Savings: ~$2 million (Cloud storage, Operating Expense, and People)

Potential Business Value Benefits

Emission control: Using real-time data to track emissions of engines, increasing the quality of Cummins engines

Investment in the right technologies: Using BI tools to predict where market trends in engine technology are headed

Leading projects in major markets: Using BI tools to improve alignment with organization strategy

Global expansion: Using BI tools to find existing and potentially new areas with demand that is not being exploited

Sallam, Rita. Sept. 2012, “Customer rate their BI /vendors on Costs.”
Sheffield, Glen. March 2015, “How much does Teradata warehouse Cost.”
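The cost figures on this slide can be checked with a few lines of arithmetic. The sketch below assumes that the cost labels and dollar amounts pair up in the order in which they appear on the slide; the names `COSTS`, `STATED_TOTAL` and `cost_shares` are introduced only for this example.

```python
# Amounts from the slide, paired with labels in slide order (an assumption).
COSTS = {
    "Software": 1_400_000,
    "Hardware": 675_000,
    "Cloud Storage": 65_000,
    "Tools": 5_750_000,
    "End user Training": 200_000,
    "Cost of Administration": 200_000,
    "Maintenance Support": 2_680_000,
    "External Contract": 400_000,
}
STATED_TOTAL = 11_370_000  # "Total Costs" as stated on the slide


def cost_shares(costs):
    """Return each category's fraction of the overall cost."""
    total = sum(costs.values())
    return {name: amount / total for name, amount in costs.items()}


total = sum(COSTS.values())
largest = max(COSTS, key=COSTS.get)

if __name__ == "__main__":
    print(f"Sum of categories: ${total:,}")
    print(f"Matches stated total: {total == STATED_TOTAL}")
    for name, share in cost_shares(COSTS).items():
        print(f"{name}: {share:.1%}")
```

Under that pairing the eight categories add up to the stated total, and the Tools line accounts for roughly half of the spend.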

18

Risks and Mitigations

Risk: Data may be breached when it is stored in the Teradata cloud
Mitigation: Teradata is partnered with Protegrity and uses tokenization technology, which is applied to data before it enters the warehouse

Risk: The Cisco data virtualization platform can raise data security concerns because all the business data is used by this platform
Mitigation:
1. The Manager that resides in the Cisco Information Server takes care of security, metadata, source code and more.
2. The Cummins IT security team will be trained on the new security policies, data governance and data standards.
3. The Change Management team will make sure that there is effective communication among vendor management, in-house IT teams and the C-suite about security measures.

Risk: The data stored in Google Compute Engine or used by MapR’s services may be breached
Mitigation: MapR is equipped with authentication mechanisms (Kerberos, Native), authorization mechanisms (Access Control Expressions, Unix File Permissions, Access Control Lists), encryption mechanisms (Over-the-Wire Encryption, Encryption at Rest, Field-Level Encryption, Format-preserving Encryption and Masking) and governance guidelines

Risk: Employees responsible for reporting, visualization or analytics may become dissatisfied while learning new tools
Mitigation: Reporting tools will remain the same, and it will be the Change Management Team’s responsibility to set the tone from the top

Risk: Inconsistent data from legacy systems will remain in the new Data Warehousing Architecture
Mitigation: The Information Governance Team and MDM tool will ensure consistent and reliable data across platforms and databases

Teradata. “Our Partners.” 2015
MapR (2014) “MapR, Hive, and Pig on Google Compute Engine”

19

Following these recommendations will lead to a successful data warehouse architecture that has the capabilities to allow users to make intelligent business decisions

Data Warehouse architecture and strategy that meets business needs and future trends

Move certain Databases from Oracle to Teradata Active Data Warehouse Private Cloud

Implement Hadoop-as-a-Service using Google Compute Engine and MapR

Implement Cisco Composite Data Virtualization Platform to provide unified logical view of all the data

Re-evaluate existing BI Toolset and purchase Tableau and Spotfire for visualization and analytics

Establish robust governance for effective use of the Data Warehouse initiative

20

Appendix

Hadoop: Why MapR?, Why Hadoop-as-a-Service?, Security, MapR Architecture

Enterprise Information Management: Capabilities, Architecture, Why OpenText?

Master Data Management

Business Intelligence Tools: Vendor Matrix, Analytical maturity model

IBM InfoSphere Streams: Why InfoSphere?, Security

CISCO Composite Virtualization layer: Functionalities, Why virtualization?, Why Composite?, CISCO Architectures, Success stories

Teradata: Characteristics, Why Private Cloud?, Operational Intelligence, Security

Information Governance team

Costs: Components, Tools, Category, Savings

Why not the Oracle Exadata proposal

21

Comparative study of MapR, Cloudera, Hortonworks and Forrester’s ranking

Robert D. Schneider (2014) “Hadoop Buyer’s Guide”, Ubuntu
Experfy.com

22

Benefits of moving Hadoop to the cloud

1. Cost: The on-premise model for deploying Hadoop would require a large number of servers, electricity and a housing facility. Cloud deployment would be more cost-effective since it offers better scalability and Cummins would pay only for what it uses.

2. Scalability: The on-premise model would require time-consuming addition of physical servers. The cloud offers massively scalable services extremely quickly.

3. Enhanced productivity: Using a cloud-based Hadoop platform would enable data access anytime from anywhere, therefore providing greater and faster access to data.

4. Collaboration: A cloud-based Hadoop platform would enable seamless collaboration across the business units. Since syncing and sharing of files would be simultaneous, the collaboration would be real time.

5. Elasticity: On-premise Hadoop clusters cannot be added or removed quickly, whereas Hadoop-as-a-Service can increase or decrease the number of clusters (instances) as per demand.

6. Handling batch jobs: The on-premise Hadoop model has scheduled jobs that process the incoming data on a fixed, temporal basis. Hadoop-as-a-Service can be optimized by having appropriately sized clusters available for the jobs to run.

7. Simplifying Hadoop operations: In the on-premise model, as clusters are consolidated there is no resource isolation for different users. Hadoop-as-a-Service allows provisioning of clusters with different configurations and characteristics, which simplifies management of a multi-tenant environment.

23

Hadoop Security

MapR offers several capabilities to help Cummins secure its data. At the product level, MapR prevents unauthorized access in order to secure the Hadoop and NoSQL data. At the solution level, MapR offers deployment of a large-scale anomaly detection solution that alerts you to network intrusion, phishing, and other cyberattacks.

Authentication is performed through
1. Kerberos Integration
2. Native authentication

Authorization is the configuration of permissions for users. The authorization mechanisms offered by MapR are
1. Access Control Expressions
2. Unix File Permissions
3. Access Control Lists

24

Hadoop Security

MapR also accounts for regulatory compliance and therefore provides four types of auditing, which are
1. maprcli commands that are related to cluster management
2. Authentications to the MapR Control System (MCS)
3. Operations on directories and files
4. Operations on MapR-DB tables

As an additional means of preventing unauthorized access to sensitive data, MapR supports encryption. The encryption mechanisms available are
1. Over-the-Wire Encryption
2. Encryption at Rest
3. Field-Level Encryption
4. Format-preserving Encryption and Masking

MapR also supports features that facilitate effective data governance. Among these are
1. Data Integration
2. Security
3. Data Lineage
4. Information Lifecycle Management
5. Auditing

25

Security in MapR

Kerberos Authentication, Native Authentication

MapR (2014) “MapR, Hive, and Pig on Google Compute Engine”

26

Security in MapR

Authorization, Auditing, Encryption

MapR (2014) “MapR, Hive, and Pig on Google Compute Engine”

27

Security in MapR

MapR (2014) “MapR, Hive, and Pig on Google Compute Engine”

28

Detailed MapR architecture

MapR (2014) “MapR, Hive, and Pig on Google Compute Engine”

29

Detailed MapR architecture

MapR (2014) “MapR, Hive, and Pig on Google Compute Engine”

30

Detailed MapR architecture

MapR (2014) “MapR, Hive, and Pig on Google Compute Engine”

31

Capabilities of the Enterprise Information Management suite

Enterprise Content Management: Information management of all types and sources of data, throughout its life cycle

Business Process Management: Rapid modeling and automation of process applications and the ability to constantly improve them

Customer Experience Management: Using information to build rich customer experiences that support collaboration, build relationships and provide support on any channel, such as web and mobile

Information exchange: Exchanging information with any party and system securely and verifiably

Discovery: Ability to find and learn about the right information at the right time and place, independent of its location

OpenText (2015) “OpenText Process Suite Platform Architecture”

32

Four layers of the EIM solution


33

Gartner declares OpenText to be a leader in Enterprise Content Management


https://en.wikipedia.org/wiki/Enterprise_information_management

34

Master Data Management

5 Steps to implementing MDM
1. Document: Identify sources while defining master data
2. Analyze: Evaluate the way the data flows, in addition to defining transformation rules
3. Construct: Build the actual MDM warehouse according to the architecture and rules created
4. Implement: Populate the data warehouse
5. Sustain: Make sure policies and compliance are upheld through the Cummins governance structure

Reasons for having Master Data Management
• Standardization of data
• Source identification
• Data classification
• Employee information management
• Product information management
• Elimination of duplicated data

MDM adds business value because it organizes master data, making it possible to have effective BI tools. This in turn enables the tools, when used properly, to inform business decisions.

https://www.quora.com/What-is-the-best-master-data-management-software

35

Buyer’s Matrix for BI Tools


Solutions Review. “2016 Solutions Review Matrix Report.” 2015

36

Analytical Maturity Model

“As an analytics platform, Spotfire offers you a variety of add-on capabilities as the sophistication of your environment grows, or as you climb up the analytics maturity curve, so to speak.” - Rishi Bhatnagar from Syntelli Solutions

Analytics Maturity Curve from Tom Davenport
Bhatnagar, Rishi. “How Much Does Spotfire Cost?” Syntelli Solutions. 25 July 2015

37

IBM InfoSphere Streams example

Example of streaming data sources associated with smart meters

Typical Streams runtime deployment of a streaming application

IBM Analytics (2015) “Top industry use cases for stream computing”
IBM Analytics (2015) “IBM Streams”

38

Forrester gives IBM high scores


Forrester Wave : Big Data Streaming Analytics Platforms, Q3 ‘14

Mike G., Rowan C. (2014) “The Forrester Wave™: Big Data Streaming Analytics Platforms, Q3 2014”

39

InfoSphere Security

Security is provided in InfoSphere Streams through user authorization and authentication. User authorization is managed through Access Control Lists, which contain the roles and their access rights. User authentication is done using either an LDAP server or a PAM authentication service. Authentication keys, session timeouts and client authentication for web management services are some of the mechanisms adopted.

40

CEP vs IBM InfoSphere

41

Discovery, optimization and caching for Composite

Discovery:
1. Introspect available data
2. Discover hidden relationships
3. Model individual view/service
4. Validate view/service
5. Modify as required
Benefits:
• Automates difficult work
• Improves time to solution
• Increases object reuse

Optimization:
1. Application invokes request
2. Optimized query (single statement) executes
3. Deliver data in proper form
Benefits:
• Up-to-the-minute data
• Optimized performance
• Less replication required

Caching:
1. Cache essential data
2. Application invokes request
3. Optimized query (leveraging cached data) executes
4. Deliver data in proper form

http://www.compositesw.com/products-services/data-discovery/
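The four caching steps listed on this slide can be sketched as a small cache placed in front of a slow data source. This is a generic illustration and not the Cisco or Composite API; the class name `CachedView`, the time-to-live expiry policy and the `fetch_from_source` callback are assumptions made for the example.

```python
import time


class CachedView:
    """Toy cache in front of a slow source (hypothetical, not a vendor API)."""

    def __init__(self, fetch_from_source, ttl_seconds=60.0, clock=time.monotonic):
        self._fetch = fetch_from_source
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache = {}  # key -> (expiry_time, rows)
        self.source_calls = 0

    def preload(self, keys):
        """Step 1: cache essential data ahead of any request."""
        for key in keys:
            self._load(key)

    def query(self, key):
        """Steps 2-4: answer from the cache when fresh, else hit the source."""
        entry = self._cache.get(key)
        if entry is not None and entry[0] > self._clock():
            return entry[1]
        return self._load(key)

    def _load(self, key):
        rows = self._fetch(key)
        self.source_calls += 1
        self._cache[key] = (self._clock() + self._ttl, rows)
        return rows


if __name__ == "__main__":
    view = CachedView(lambda name: [f"rows of {name}"], ttl_seconds=30)
    view.preload(["sales"])
    print(view.query("sales"), "source calls:", view.source_calls)
```

The time-to-live is the knob that trades freshness against load on the source system: a short value keeps data closer to up-to-the-minute, a long one spares the source more queries.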

42

Business case for virtualization

• Profit Growth – Data virtualization delivers the information your organization requires to increase revenue and reduce costs.

• Risk Reduction – Data virtualization’s up-to-the-minute business insights help you manage business risk and reduce compliance penalties. Plus, data virtualization’s rapid development and quick iterations lower your IT project risk.

• Technology Optimization – Data virtualization improves utilization of existing server and storage investments. And with less storage required, hardware and governance savings are substantial.

• Staff Productivity – Data virtualization’s easy-to-use, high-productivity design and development environments improve your staff effectiveness and efficiency.

• Time-to-Solution Acceleration – Your data virtualization projects are completed faster so business benefits are derived sooner. Lower project costs are an additional agility benefit.

http://www.compositesw.com/data-virtualization/

43

Virtualization versus Cloud

• Security – Data integration in the cloud: putting all of the business’s data in the cloud is a huge risk.

• Capacity management – Peak times, holiday sales

• Redundancy of data without complete utilization of hardware resources

• In-house capabilities to handle

http://www.businessnewsdaily.com/5791-virtualization-vs-cloud-computing.html

44

Key benefits of Composite

PROVIDES INSTANT ACCESS TO ALL DATA:
• Complete information – Business needs the complete picture. Cisco’s data federation technology virtually integrates data from multiple sources, without the cost and overhead of physical data consolidation.
• Up-to-the-minute information – Cisco’s query optimization algorithms and techniques are fastest in the industry, delivering the timely information business requires without impacting source system performance.
• Fit-for-purpose information – Cisco’s powerful data abstraction functions simplify complex data, transforming it from native structures and syntax into easy-to-understand business views and data services.

RESPOND FASTER TO ANALYTIC AND BI TRENDS:
• Streamlined process – Building business views and data services in Cisco is far faster, with far fewer moving parts, than building physical data stores and filling them using ETL.
• Rapid IT response – Cisco’s reusable views and services, flexible data virtualization architecture, and automated impact analysis provide the IT agility required to keep pace with business change.
• Quick iterations – Prototyping new solutions is far faster with Cisco DV. Cisco’s rapid development tools surface live data in just minutes, enabling extraordinary business and IT collaboration.

END TO END DATA MANAGEMENT:
• Data Discovery – Cisco’s introspection and unique-in-the-industry data discovery uncover existing information assets, unlocking them for valuable new uses.
• Standards-based – Cisco’s numerous standards-based access and delivery options support all the information types business users require.
• Data Governance – Information is a critical asset. To maximize control, Cisco’s data governance centralizes metadata management, ensures data security, improves data quality and provides full auditability and lineage.

http://www.compositesw.com/products-services/data-virtualization-platform/

45

Vendor evaluation matrix for Composite

Criteria | Composite | Informatica | IBM | Denodo
Federated query technology | 5 | 4 | 3 | 1
Scalability | 5 | 4 | 5 | 4
Data quality | 4 | 5 | 5 | 4
Maintenance and support | 4 | 5 | 4 | 4
Caching | 5 | 4 | 4 | 2
Profiling | 5 | 4 | 3 | 2
Costs | 3 | 1 | 1 | 4
Version upgrades | 4 | 3 | 2 | 3
Complexity of integrated Portfolio management | 4 | 3 | 2 | 3
Metadata support | 5 | 4 | 4 | 2
Area of skills and Best practice documentation | 4 | 3 | 3 | 2
Customer base | 5 | 4 | 4 | 3
Agility | 5 | 4 | 4 | 3
Time to value | 5 | 4 | 4 | 3
Compatibility with existing technologies | 5 | 4 | 4 | 4
Forrester ranking | 5 | 4 | 4 | 3
Master data management | 4 | 5 | 5 | 4
Total | 72 | 65 | 61 | 55
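The column totals of this matrix can be recomputed from the individual scores, as in the sketch below. The scores and stated totals are copied from the slide as transcribed; the names `ROWS`, `STATED_TOTALS` and `column_totals` are introduced for the example. With these transcribed values, the sums for Informatica and IBM agree with the stated totals, while the sums for Composite and Denodo come out as 77 and 51 rather than 72 and 55. The transcript does not show whether a score or a total is off, and the ranking of the four vendors is the same either way.

```python
VENDORS = ("Composite", "Informatica", "IBM", "Denodo")

# One tuple of scores per criterion, in VENDORS order, as transcribed.
ROWS = {
    "Federated query technology": (5, 4, 3, 1),
    "Scalability": (5, 4, 5, 4),
    "Data quality": (4, 5, 5, 4),
    "Maintenance and support": (4, 5, 4, 4),
    "Caching": (5, 4, 4, 2),
    "Profiling": (5, 4, 3, 2),
    "Costs": (3, 1, 1, 4),
    "Version upgrades": (4, 3, 2, 3),
    "Complexity of integrated Portfolio management": (4, 3, 2, 3),
    "Metadata support": (5, 4, 4, 2),
    "Area of skills and Best practice documentation": (4, 3, 3, 2),
    "Customer base": (5, 4, 4, 3),
    "Agility": (5, 4, 4, 3),
    "Time to value": (5, 4, 4, 3),
    "Compatibility with existing technologies": (5, 4, 4, 4),
    "Forrester ranking": (5, 4, 4, 3),
    "Master data management": (4, 5, 5, 4),
}

STATED_TOTALS = {"Composite": 72, "Informatica": 65, "IBM": 61, "Denodo": 55}


def column_totals(rows, vendors):
    """Sum every vendor's column across all criteria."""
    return {
        vendor: sum(scores[i] for scores in rows.values())
        for i, vendor in enumerate(vendors)
    }


totals = column_totals(ROWS, VENDORS)
mismatches = {
    vendor: (totals[vendor], STATED_TOTALS[vendor])
    for vendor in VENDORS
    if totals[vendor] != STATED_TOTALS[vendor]
}

if __name__ == "__main__":
    print("Recomputed:", totals)
    print("Differs from slide (recomputed, stated):", mismatches)
```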

46

Cisco’s Data Virtualization Platform

Data WarehouseExtend / Offload, Governance, Risk & Compliance, Business Intelligence, Customer Experience Management, Mergers & Acquisitions, Single View of Enterprise Data, Supply Chain Management Analytics

Cisco Information Server: Development Environment, Runtime Server Environment, Management Environment

Discovery, Studio, Adapters, Manager, Monitor, Active Cluster

Packaged Apps, RDBMS, Excel Files, Data Warehouse, OLAP Cubes, Hadoop / “Big Data”, XML Docs, Flat Files, Web Services

http://www.compositesw.com/products-services/data-virtualization-platform/

47

Cisco’s Data Virtualization Platform

http://www.compositesw.com/products-services/data-virtualization-platform/

48

Composite creates virtual marts, views and services

http://www.compositesw.com/data-virtualization/virtual-data-marts/
http://www.compositesw.com/data-virtualization/operational-data-stores/

49

Success stories of Composite

Company | Before | After
Qualcomm | BI projects took 3 - 4 months | Days/Weeks
Pfizer | Management requests for data took weeks | Hours/Days
Northern Trust | 100% data replication | 20% replication

http://www.slideshare.net/CiscoPublicSector/composite-data-virtualization

50

Characteristics of Teradata ADW Private cloud

Main characteristics of Teradata ADW Private cloud include:

• Virtualized resources – Teradata virtualizes all processing and storage so users do not have to be concerned about the location or availability of system resources – only that they are getting timely answers to all their business questions automatically without performance penalty.

• Business analytics – a Teradata Data Lab makes it easier for business users to explore unique data sets or prototype new analytic ideas.

• Consistent performance – enables IT to meet business user service level agreements and to ensure user satisfaction by leveraging Teradata’s industry leading workload management as well as key technologies such as hybrid storage and columnar.

• Elasticity – delivers the analytic resources dynamically and in real time as business user demand increases and decreases.

• Scalability – enables the environment to scale seamlessly across multiple dimensions including number of users, number of queries, and data volumes with support for data scalability up to 92 petabytes.

http://www.teradata.com/News-Releases/2012/Teradata-Active-Data-Warehouses-Provide-Private-Cloud-Benefits-Today/?LangType=1033&LangSelect=true

51

Features of Teradata ADW private cloud

• Active access – high-speed inquiries, analysis, or alerts retrieved from the ADW and delivered to operational users, devices, or systems.

• Active events – operational events that need to be continuously monitored, filtered, and alerts sent based on business rules.

• Active load – high-frequency data loading throughout the business day to ensure data are fresh enough to support active access and active events.

• Active enterprise integration – links the ADW to existing applications, portals, Web services, service-oriented architectures, and the enterprise service bus.

• Active workload management – dynamic management of operational and strategic workloads in the same database, ensuring response times and maximum throughput.

• Active availability – increasing the data warehouse availability from business critical to mission critical.

http://www.teradata.com/resources/white-papers/Enabling-the-Agile-Enterprise-with-Active-Data-Warehousing-eb4931/?LangType=1033&LangSelect=true

52

Private cloud adoption


http://www.datamation.com/cloud-computing/what-is-private-cloud.html

53

Teradata provides operational intelligence


http://www.teradata.com/resources/white-papers/Enabling-the-Agile-Enterprise-with-Active-Data-Warehousing-eb4931/?LangType=1033&LangSelect=true


54

Teradata provides operational intelligence - Framework


http://www.teradata.com/resources/white-papers/Enabling-the-Agile-Enterprise-with-Active-Data-Warehousing-eb4931/?LangType=1033&LangSelect=true

55

Security in Teradata

Teradata’s Active Data Warehouse can make data available predictably and securely by leveraging Protegrity’s Vaultless Tokenization technology. Tokenization is applied to the sensitive data before it enters the warehouse, using the enterprise’s own security policies. This provides a security layer for all information in the database wherever it flows, without affecting the business’s ability to perform rapid analysis on that data. The solution relies upon Protegrity’s patent-pending Vaultless Tokenization, which deploys a very small set of lookup tables of random values without having to store either the sensitive data or the tokens. Tokenized data can be mined and manipulated by business processes without having to return the data to its original form, improving accessibility and performance while keeping the data protected.


http://www.teradata.com/partners/Protegrity-USA/?LangType=1033&LangSelect=true
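The idea of tokenizing with a small set of lookup tables of random values, without storing the sensitive data or the tokens, can be shown with a toy example. This is an illustration of the general idea only: it is not Protegrity’s algorithm, it is not secure, and the class name `ToyTokenizer`, the per-position digit tables and the seed-as-secret are all assumptions made for the example.

```python
import random


class ToyTokenizer:
    """Toy digit tokenizer built from small random lookup tables.

    Illustration only: NOT Protegrity's method and not secure.
    """

    def __init__(self, seed, max_length=16):
        rng = random.Random(seed)  # the seed plays the role of a secret
        digits = "0123456789"
        self._forward = []
        self._reverse = []
        # One small shuffled table per character position.
        for _ in range(max_length):
            shuffled = list(digits)
            rng.shuffle(shuffled)
            self._forward.append(dict(zip(digits, shuffled)))
            self._reverse.append(dict(zip(shuffled, digits)))

    def tokenize(self, value):
        return self._map(value, self._forward)

    def detokenize(self, token):
        return self._map(token, self._reverse)

    def _map(self, text, tables):
        if len(text) > len(tables):
            raise ValueError("value is longer than the tokenizer supports")
        # Non-digit characters (dashes, spaces) pass through unchanged,
        # so the token keeps the format of the original value.
        return "".join(tables[i].get(ch, ch) for i, ch in enumerate(text))


if __name__ == "__main__":
    tokenizer = ToyTokenizer(seed=2015)
    token = tokenizer.tokenize("4111-1111-1111")
    print(token, "->", tokenizer.detokenize(token))
```

Only the lookup tables are kept; neither the original values nor the tokens are stored, and the token has the same length and layout as the input, which is why tokenized data can still flow through existing processes.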

56

Information Governance Team

• Legal: The department works with IT and is driven by policy issues such as compliance and privacy.

• Records/compliance/audit: Deals with record compliance, document workflow, and archiving strategies. Also makes sure that policy is carried out enterprise-wide.

• IT: Helps with more technical issues, making sure policies are configured in the systems architecture.

• Info Security: Ensures that sensitive data is held in secure repositories and does not leak into unsecured areas.

• Business Unit: Helps to spread the policy and compliance information to the rest of its BU.

Managing information through its lifecycle and supporting the organization’s strategy, operations, regulatory, legal, risk and environmental requirements.

This team will manage records, business intelligence and MDM policies, rules and

57

Cost of each component

Hadoop
• $4000 per node for support
• Software is one time cost
• Cloud is ~$600 per TB

MDM
• $13,000 per collaboration server user (2) assuming $500 per user assuming 20 users

Teradata
• $2000 per TB
• $2.5 million for in house support

OpenText
• $2000 per user

58

Cost of tools


59

Cost of each category

60

Cost Savings

These cost savings are based on how much cheaper it is to store data in the cloud than to store it outside the cloud.

The operating expense savings are an estimate derived from the increased number of projects Cummins will be able to do with proper BI tools.

People cost savings are derived from the smaller number of people who will have to provide support.

61

Cost Sources

Components
http://googlecloudplatform.blogspot.com/2015/07/understanding-
https://blogs.oracle.com/datawarehousing/entry/updated_price_com
http://estore.gemini-systems.com/ibm/software-
http://sheffieldview.com/2015/03/11/how-much-does-a-teradata-data-warehouse-appliance-cost/
https://core.opentext.com/pricing.html

Tools
http://www.ciosummits.com/Online_Assets_IT_Central_Station_Business_Intelligence_Tools_Report.pdf
http://www.tableau.com/gartner-business-intelligence-costs
http://www.practicaldb.com/data-visualization-consulting/tableau-vs-spotfire/
https://www.betterbuys.com/bi/roi-business-intelligence/

62

Our recommended solution is better than the previously proposed Oracle Exadata solution for the following reasons

• Future trends like Cloud, Big Data, consolidation across platforms and real-time analytics are not supported by Oracle Exadata.

• High Scalability
• High Availability
• 90-95% Resource utilization
• Data management
• Can easily respond to changing BI and analytic trends
• Cost savings – cuts in maintenance and support costs, hardware costs, labor costs etc.
• Hadoop Cloud with MapR technologies has huge advantages – efficiency, collaboration, scalability etc.
• Moving operational data to Teradata can provide near-real-time data warehousing, which supports intelligent business decisions
• Cummins’ end goal is to have a single truth of data with availability, data quality and usability, which is met by the Cisco Composite data virtualization platform.