Data Services Marketplace
Agenda
1. Introductions
2. Logical Data Warehouse and Data Lakes
3. Coffee Break
4. Data Services Marketplaces
5. Q&A
Agenda
1. Data Services Marketplace
2. Data Services Demo
3. Addressing the Challenges
4. Customer Success Stories
5. Q&A
Data, Data, Everywhere…
• Organizations are awash with data, but…
• How do I know what data is available?
• What’s its structure?
• How do I know how good it is?
• How do I access the data?
• Data Services Marketplaces address these
questions
• Provide a mechanism for end users and
developers to find and access data
• For reports, applications, analytics, etc.
And not a drop of it to read!
What is a Data Services Marketplace?
A single place where consumers of data –
developers or end users – can search for, find,
and access data, that is available to them, as a
service.
Data Services Marketplace
Enterprise Apps
SQL (JDBC/ODBC), RESTful Web Services, SOAP, JMS, etc.
Operational Systems
Analytical Systems
Big Data
External/SaaS Systems
Virtual Data Marts
Virtual ODS
Reusable Data Services
Metadata
Scheduling & Delivery
Usage Stats
Enterprise Data Service Registry
Data Services Layer
Enterprise Data Service Registry
• Catalog of data available to consumers
• Metadata for data ‘services’
• Format and structure of data, description of data and attributes
• Data lineage information – where does the data come from?
• Access permissions for data services
• Enforcing privacy policies for PII
• Monitoring and auditing of data usage
• Monitoring and managing QoS/SLA
• Knowing who is accessing data, when, and how…
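The registry functions listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any product implements its catalog: the `ServiceEntry` and `Registry` classes, their field names, and the sample entry are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ServiceEntry:
    """One catalog record; every field name here is an illustrative assumption."""
    name: str
    description: str
    columns: dict[str, str]                            # attribute name -> type
    lineage: list[str] = field(default_factory=list)   # upstream sources
    allowed_roles: set[str] = field(default_factory=set)


class Registry:
    """Hypothetical in-memory catalog with search, permissions, and auditing."""

    def __init__(self) -> None:
        self._entries: dict[str, ServiceEntry] = {}
        self.audit_log: list[tuple[str, str]] = []     # (user, service name)

    def register(self, entry: ServiceEntry) -> None:
        self._entries[entry.name] = entry

    def search(self, keyword: str, roles: set[str]) -> list[ServiceEntry]:
        """Return entries matching the keyword that the caller's roles may access."""
        kw = keyword.lower()
        return [
            e for e in self._entries.values()
            if (kw in e.name.lower() or kw in e.description.lower())
            and e.allowed_roles & roles
        ]

    def record_access(self, user: str, service: str) -> None:
        """Keep a trail of who accessed which service."""
        self.audit_log.append((user, service))


registry = Registry()
registry.register(ServiceEntry(
    name="customer_profile",
    description="Customer master data",
    columns={"id": "int", "name": "str"},
    lineage=["crm.customers"],
    allowed_roles={"employee"},
))
hits = registry.search("customer", roles={"employee"})
```

A consumer only sees entries that both match the search term and are permitted for one of their roles, which is the "available to them" part of the marketplace definition.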
Virtual Data Services Layer
A data access layer that abstracts underlying data sources and
exposes them as discrete services to form a ‘data API’
Different users and developers across the enterprise can access data in a
secure and managed fashion and share a common data ‘model’
Provides secure and managed access to data across the enterprise
Provides consistency of data
Hides complexity, format, and location of actual data sources
Supports many consumption protocols and patterns
Example: Single data access layer for all development teams to avoid
‘hunting down and interpreting data differently by project’
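As a rough sketch of the idea, the Python below publishes one named service that combines two sources with different shapes, so consumers see a single common model. The source functions, the `DataServicesLayer` class, and the sample rows are assumptions made for illustration only.

```python
from __future__ import annotations

from typing import Callable, Iterable

Row = dict[str, object]


# Two stand-in sources with different field names (hypothetical data).
def crm_source() -> Iterable[Row]:
    yield {"cust_id": 1, "full_name": "Ada"}


def billing_source() -> Iterable[Row]:
    yield {"customer": 1, "balance_usd": 42.5}


class DataServicesLayer:
    """Maps service names to functions; consumers never touch the sources."""

    def __init__(self) -> None:
        self._services: dict[str, Callable[[], list[Row]]] = {}

    def publish(self, name: str, fn: Callable[[], list[Row]]) -> None:
        self._services[name] = fn

    def query(self, name: str) -> list[Row]:
        return self._services[name]()


def customer_balance() -> list[Row]:
    """Combine both sources into one common model ('as needed', not 'as stored')."""
    balances = {r["customer"]: r["balance_usd"] for r in billing_source()}
    return [
        {"id": r["cust_id"], "name": r["full_name"],
         "balance": balances.get(r["cust_id"])}
        for r in crm_source()
    ]


layer = DataServicesLayer()
layer.publish("customer_balance", customer_balance)
rows = layer.query("customer_balance")
```

Every team calling `layer.query("customer_balance")` gets the same field names and the same join logic, regardless of where the underlying data lives.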
Data Services Layer
Enterprise Apps
SQL (JDBC/ODBC), RESTful Web Services, SOAP, JMS, etc.
Operational Systems
Analytical Systems
Big Data
External/SaaS Systems
Benefits of Data Services
• Agility
• Rapid development, service reuse, quicker time-to-value
• Data Integration
• Combine data to provide data ‘as needed’ not ‘as stored’
• Aligned with logical data models
• Data Quality
• Data consistency, common ‘model’
• Single Point of Interaction
• Users don’t need direct access to data sources, better management and
security
Challenges of Data Services
• Security
• How secure is the data? How is access controlled?
• Privacy
• How is PII protected? How can you audit access compliance?
• Performance/QoS
• Does the data services layer ‘get in the way’? How does it impact
performance? And QoS/SLAs?
• Data Governance and Veracity
• How do you know that the data is ‘good’?
Implementing Data Services
• Data services can be implemented using a
number of different technologies:
1. ESB/SOA
2. ETL
3. MDM
4. Data Virtualization
• Typically it will be one or more of the above
Different Technologies
Data Services with Data Virtualization
• Optimized for data services
• Configuration and not coding
• Rapid development and time-to-value
• Supports multiple delivery styles
• Real-time/right-time, batch/file, etc.
• Multiple protocols – SQL (JDBC/ODBC), Web Services (REST/SOAP), …
• Complements other technologies
• MDM exposed as services through data virtualization
• Combined with an ESB for process flows
The Foundation for the Data Services Marketplace
Challenges of Data Services
• Security & Privacy
• How secure is the data? How is access controlled?
• How is PII protected? How can you audit access compliance?
• Performance & QoS
• Does the data services layer ‘get in the way’? How does it impact
performance?
• How can we control the resources to comply with SLAs?
• Data Governance & Veracity
• How do we know that the data is ‘good’?
Security in Denodo
Overview
Authentication
• Pass-through authentication
• Kerberos and Windows SSO
• OAuth, SPNEGO
Authentication
• Standard JDBC/ODBC security
• Kerberos and Windows SSO
• Web Service security
LDAP / Active Directory
Role-based Authentication: guest, employee, corporate
Schema-wide Permissions
Data Specific Permissions (Row, Column level, Masking)
Policy Based Security
Data in motion
• SSL/TLS
Encrypted data at rest
• Cache
• Swap
Security in Denodo
Data in Motion – secure channels
Using SSL/TLS
Client-to-Denodo and Denodo-to-source
Available for all protocols (JDBC, ODBC, ADO.NET and WS)
WS security: Basic, Digest, SPNEGO (Kerberos), integration with LDAP
Data at Rest – secure storage
Cache: stored in a third-party database, which can apply its own encryption mechanism
Swapping to disk: serialized data is stored temporarily in a configurable folder that can be encrypted by the OS
Encryption/Decryption
Support for custom decryption for files and web services
Transparent integration with RDBMS encryption
Securing data
Security in Denodo
Authentication
Native and LDAP/Active Directory based
Support for Kerberos and Windows SSO
Authorization
Virtual Database
View
Row and Column level authorization
Masking
Custom policies for specific security constraints and integration with external policy servers
Roles
Integration with LDAP/AD groups
Role hierarchies supported
Pass-through session credentials
Leverage existing source privileges
Authentication and Authorization
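Row-level authorization, column-level authorization, and masking can be illustrated with a small Python sketch. The role names come from the overview slide; the `POLICIES` table, the `mask` rule (keep the last four characters), and the sample rows are assumptions, not Denodo's configuration model.

```python
from __future__ import annotations

Row = dict[str, object]

# Hypothetical per-role policy: visible columns, row predicate, masked columns.
POLICIES = {
    "employee": {
        "columns": ["name", "office", "ssn"],
        "row_filter": lambda r: r["office"] == "New York",
        "masked": {"ssn"},
    },
    "corporate": {
        "columns": ["name", "office", "ssn", "salary"],
        "row_filter": lambda r: True,
        "masked": set(),
    },
}


def mask(value: object) -> str:
    """Keep the last four characters and hide the rest."""
    text = str(value)
    return "*" * max(len(text) - 4, 0) + text[-4:]


def secure_view(rows: list[Row], role: str) -> list[Row]:
    """Apply row filtering, column restriction, and masking for one role."""
    policy = POLICIES[role]
    result = []
    for row in rows:
        if not policy["row_filter"](row):
            continue  # row-level authorization
        result.append({
            col: mask(row[col]) if col in policy["masked"] else row[col]
            for col in policy["columns"]  # column-level authorization
        })
    return result


employees = [
    {"name": "Ada", "office": "New York", "ssn": "123-45-6789", "salary": 100},
    {"name": "Bob", "office": "Austin", "ssn": "987-65-4321", "salary": 90},
]
employee_view = secure_view(employees, "employee")
corporate_view = secure_view(employees, "corporate")
```

The same underlying rows produce different results per role, so consumers never need direct access to the source tables.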
Custom Policy
Conditions satisfied
Security: applies custom security policies
• If person accessing data has role of 'Supervisor' and location is 'New York', then show compensation information for employees in the New York office only.
Enforcement: rejects/filters queries by specified criteria like user priority, cost, time of day etc.
• If the production batch window runs from 3 am to 6 am, production servers are under increased load during that time, so all queries against those servers can be blocked during the window to prevent a process from failing.
Data consuming users, Apps
Query
Accept / add filters
Reject
Security - Custom Policies
Interception of queries before they are executed
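The two example policies on this slide can be sketched as a single interception function in Python. This is a hypothetical illustration: the `Query` class, the `intercept` function, the string form of the filter, and the choice to reject anyone who does not satisfy the conditions are assumptions; the slide only describes the accept/add-filters/reject outcomes.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import time


@dataclass
class Query:
    """Hypothetical query description seen by the policy before execution."""
    user_role: str
    user_location: str
    view: str
    filters: list[str] = field(default_factory=list)


class Rejected(Exception):
    """Raised when a policy refuses to run the query."""


BATCH_START, BATCH_END = time(3, 0), time(6, 0)  # 3 am - 6 am batch window


def intercept(query: Query, now: time) -> Query:
    """Accept the query, add filters to it, or reject it."""
    # Enforcement: block everything during the production batch window.
    if BATCH_START <= now < BATCH_END:
        raise Rejected("production batch window")
    # Security: New York supervisors see New York compensation only.
    if query.view == "compensation":
        if query.user_role == "Supervisor" and query.user_location == "New York":
            query.filters.append("office = 'New York'")
        else:
            # Assumption: anyone else is rejected outright.
            raise Rejected("conditions not satisfied")
    return query


accepted = intercept(Query("Supervisor", "New York", "compensation"), time(10, 0))
```

Because the check runs before execution, a rejected query never reaches the production servers.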
Resource Manager
Apply resource restrictions based on a set of rules
Rules classify sessions into groups
By user, role, application, IP, time of the day, etc.
E.g. Connections from application ‘app1’ coming from users with role
‘reporting’ are assigned to a group
Apply restrictions for each group.
Change priority, change concurrency settings, change max timeouts, etc
Controlled Resource Allocation
Resource Manager
Controlled Resource Allocation
1. Define a rule that is triggered for “app1” and users with the role “reporting”
2. For requests that match the rule, if CPU usage is greater than 85%, apply the following:
• Reduce thread priority
• Reduce the number of concurrent requests
• Limit the number of queued queries
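The two steps above can be sketched in Python as a classification rule followed by a conditional restriction. The 85% CPU threshold, the application name, and the role come from the slide; the `Session` and `Limits` classes and every numeric limit are placeholder assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Session:
    application: str
    roles: set[str]


@dataclass
class Limits:
    """Hypothetical default limits; the numbers are placeholders."""
    thread_priority: int = 5
    max_concurrent: int = 20
    max_queued: int = 100


def classify(session: Session) -> str:
    """Step 1: rules classify sessions into groups."""
    if session.application == "app1" and "reporting" in session.roles:
        return "app1_reporting"
    return "default"


def limits_for(session: Session, cpu_usage_pct: float) -> Limits:
    """Step 2: restrict the group only while CPU usage exceeds 85%."""
    if classify(session) == "app1_reporting" and cpu_usage_pct > 85:
        return Limits(thread_priority=1, max_concurrent=5, max_queued=10)
    return Limits()


reporting = Session(application="app1", roles={"reporting"})
throttled = limits_for(reporting, cpu_usage_pct=90)
normal = limits_for(reporting, cpu_usage_pct=50)
```

Sessions outside the group keep the default limits even under high CPU load, so the restriction targets only the traffic the rule names.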
Performance Features
Data Provisioning Layer
Selective Materialization
Intelligent Caching of only the most relevant and often used information
Streaming & pagination
Operate on data in streaming mode for a low memory footprint. Paginate responses to control the size of datasets
Parallelism
Parallel access to disparate sources to minimize latency
NESTED JOINs for concurrent access to sources with restricted query capabilities
Optimized Resource Management
Smart allocation of resources to handle high concurrency
Throttling to control and mitigate source impact
Resource plans based on rules
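Streaming, pagination, and parallel source access are general techniques, and a small Python sketch may make them concrete. It uses only the standard library; the function names and the in-memory stand-in sources are assumptions and do not reflect the platform's internals.

```python
from __future__ import annotations

from concurrent.futures import ThreadPoolExecutor
from itertools import islice
from typing import Callable, Iterable, Iterator


def stream_rows(count: int) -> Iterator[dict]:
    """Yield rows one at a time instead of building a full list in memory."""
    for i in range(count):
        yield {"id": i}


def paginate(rows: Iterable[dict], page_size: int) -> Iterator[list[dict]]:
    """Group a row stream into pages of at most page_size rows."""
    it = iter(rows)
    while page := list(islice(it, page_size)):
        yield page


def fetch_all(sources: dict[str, Callable[[], list]]) -> dict[str, list]:
    """Query every source concurrently and collect results by source name."""
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(fn) for name, fn in sources.items()}
        return {name: fut.result() for name, fut in futures.items()}


pages = list(paginate(stream_rows(5), page_size=2))
results = fetch_all({"crm": lambda: [1, 2], "erp": lambda: [3]})
```

With parallel access, total latency tends toward that of the slowest source rather than the sum of all sources, and pagination caps how much data any single response carries.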
Quality of Service in Real Scenarios
• Multinational insurance & reinsurance company
• Average response time of 80-100ms
• 200+ concurrent queries
• 2 nodes – 4 cores each
• Global semiconductor chip manufacturer
• Enterprise-wide data access layer
• 200+ developers trained in Denodo
• ~50 data sources, 90+ data services published
• Response times under 120ms, well in compliance with their internal SLAs (200-300ms)
• 128+ cores in production
Data Provisioning Layer
Enterprise Data Governance
Understand the “source of truth” and transformations of every piece of data in the
model
Data lineage
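Lineage can be thought of as a graph from each published view back to its original sources. The Python below is a minimal sketch of that idea; the `LINEAGE` mapping and all view and table names in it are hypothetical.

```python
from __future__ import annotations

# Hypothetical lineage: each view lists the views or tables it derives from.
LINEAGE = {
    "customer_360": ["customer_clean", "orders_summary"],
    "customer_clean": ["crm.customers"],
    "orders_summary": ["erp.orders"],
}


def sources_of(view: str) -> set[str]:
    """Walk upstream until reaching nodes with no recorded parents."""
    parents = LINEAGE.get(view)
    if not parents:
        return {view}  # a physical source: nothing further upstream
    found: set[str] = set()
    for parent in parents:
        found |= sources_of(parent)
    return found


origins = sources_of("customer_360")
```

Answering "where does this data come from?" then becomes a traversal of the recorded transformations rather than a manual investigation.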
DrillingInfo
• SaaS-based platform that provides business intelligence and
decision support technology
• Facilitates faster, smarter decisions for the oil and gas upstream
E&P industry
• Headquartered in Austin, Texas, with more than 400 employees on 5 continents
• Services 3,000+ companies globally
Overview
“As a data and business intelligence provider, one of our biggest
challenges is the need to rapidly sell the data that we acquire. The
Denodo Platform enables us to build and deliver data services to our
internal and external consumers within 3–4 hours instead of the 1–2
weeks that it would take with ETL.”
- Jay Heydt, Manager, Drillinginfo
Guardian Life
• Large mutual life insurer with $7.3 billion in capital and $1.5 billion in operating income in 2015.
• Founded in 1860, the company has paid dividends to policyholders every year since 1868.
• ~8,000 employees and over 3,000 financial representatives in 70+ agencies nationwide.
• Offerings:
• Life insurance
• Disability income insurance
• Annuities
• Investments to dental, vision, and 401(k) plans.
Overview
Thanks!
www.denodo.com
© Copyright Denodo Technologies. All rights reserved. Unless otherwise specified, no part of this PDF file may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm, without the prior written authorization of Denodo Technologies.