WhamTech SmartData Fabric - Data Security Layer
SmartData Fabric® security-centric distributed data virtualization, master data management and virtual graph database
SmartData Fabric®
Data Security Layer (DSL)
May 2021
Revision 4.1 Copyright 2021 WhamTech, Inc. 1
Conventional cybersecurity and DLP are not working!
All stats are going in the wrong direction. Estimated for 2020 (compared to 2019):
• Over 30 billion records exposed (up 200%)
• 8,000 incidents (up 110%)
• Over 3.8 million records per incident (up 180%)
• Share of exposed data that was encrypted dropped to < 1%, from 4% in the past
• Insider-related thefts increasing, approaching 40% of all incidents, maybe higher
• Stats are mixed, but data thefts are taking longer to discover as the number of incidents increases
• Data theft costs and fines are increasing
• Cybersecurity costs are increasing
The outlook is very poor for PHI/HIPAA, PCI, PII, CCPA/CPRA, GDPR and other regulations, and national security
WhamTech enterprise solution data security use case
SmartData Fabric® data virtualization across multiple data sources to provision highly curated data directly to Tableau for a large US federal contractor, in partnership with the vendor Tableau (Ref. 1), to enable:
• Advanced semi-automated data discovery and classification (Ref. 2)
• Metadata management, data governance and master data management
• Active Directory (AD)-based Identity and Access Management (IAM), Role-based Access Control (RBAC), Single Sign-on (SSO), Attribute-based Access Control (ABAC)/Row-level Security (RLS) and Column-level Security (CLS) through Tableau to almost any backend data source system, regardless of these systems’ support for any of these access control or data security measures
• Oracle Virtual Directory (OVD) to allow cross/multi-domain user management, required because of M&A
• De facto data virtualization, federation and integration Data Security Layer (DSL) for the contractor and eventually their main customer, the US federal government
• Military-grade access control and data security on data sources with indexed adapters, conventional (unindexed) federated adapters or both simultaneously
• Eventual support for User Behavior Analytics (UBA) by combining metadata, data governance, user logs, access control, data security, virtual graph/link analysis and interactive graph/link visualization
OUTCOME:
WhamTech SmartData Fabric® can apply enterprise-level access security, data
security and more, on ANY source system, by leveraging LDAP/AD-based IAM,
RBAC, SSO, ABAC/RLS, CLS and cross/multi-domain, regardless of the source
system’s support for any of these access control or data security measures
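To illustrate the gatekeeper idea in the outcome above, the following is a minimal sketch of a layer that applies role, row and column rules in front of a source that has no access control of its own. It is not WhamTech's implementation: the `Policy` class, the `apply_policy` function, the role names and the sample columns are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Policy:
    """Hypothetical per-role policy held by the gatekeeper layer."""
    allowed_columns: frozenset           # CLS: columns the role may see
    row_attribute: Optional[str] = None  # ABAC/RLS: row field matched against a user attribute


# Hypothetical role-to-policy mapping (RBAC).
POLICIES = {
    "analyst": Policy(frozenset({"region", "amount"}), row_attribute="region"),
    "auditor": Policy(frozenset({"customer", "region", "amount"})),
}


def apply_policy(rows, role, user_attrs):
    """Filter rows from any source according to the caller's role."""
    policy = POLICIES.get(role)
    if policy is None:
        # Default deny: an unknown role sees nothing.
        raise PermissionError(f"role {role!r} has no access")
    result = []
    for row in rows:
        if policy.row_attribute is not None:
            wanted = user_attrs.get(policy.row_attribute)
            # RLS: skip rows that do not match the user's attribute value.
            if wanted is None or row.get(policy.row_attribute) != wanted:
                continue
        # CLS: project only the columns the role is allowed to see.
        result.append({k: v for k, v in row.items() if k in policy.allowed_columns})
    return result
```

Because the rules live in the layer rather than in the source, the same check applies whether the rows come from a database, a flat file or a mainframe extract.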
SmartData Fabric® Data Security Layer for indexed adapters
Enterprise-level Security
Data-level Security
WhamTech SmartData Fabric® DSL is the GATEKEEPER that bridges the gap between enterprise-level security (access) and data-level security
SmartData Fabric® Data Security Layer for conventional (unindexed) adapters
Enterprise-level Security
Data-level Security
WhamTech SmartData Fabric® DSL is the GATEKEEPER that bridges the gap between enterprise-level security (access) and data-level security
How the Data Security Layer fits with other security components
[Figure: overlapping Data Security and Cybersecurity regions at host level, with the intersection of data security and cybersecurity marked]
DSL: Data discovery, profiling, classification, indexes, metadata, data governance, master data, LDAP/AD, IAM, RBAC, SSO, ABAC/RLS, CLS, VD, BI/reporting and analytics, including link and user behavior analytics
CS: Additional cybersecurity capabilities involving log indexes and integration, and link and event correlation with other data sources
The following are based on the Forrester Zero Trust Data Security Framework = Discover, INDEX, Classify and Secure (Refs. 3, 4 and 5)
State of Cyber Attacks
Losing Data Security War
• Successful attacks more frequent and larger in scope
• Attacks taking longer to discover
• Attacks more sophisticated
• Experts in increasingly short supply
• Analytics still highly manual on increasing volumes of data
Accept that external-origin and insider threats are present in the enterprise, whether or not anyone is aware of them or admits it!
Ultimate goal:
Protect the Crown Jewels (aka Valuable Data)
Focus less on building better mousetraps
Regulatory data protection requirements
Increasing number and breadth of personal data types that need to be protected:
• Protected Health Information (PHI) – HIPAA, etc.
• Payment card data – Payment Card Industry (PCI) standards
• Personally Identifiable Information (PII)
… and the most far-reaching yet:
• California Consumer Privacy Act (CCPA) and EU General Data Protection Regulation (GDPR)
– Very broad definition of personal data
– Example use case is listed at the end of this presentation
• Others…
Apart from the massive costs and fines that follow data breaches involving the above, there are business reputation losses
Zero Trust Data Security Framework
• Define: Data discovery, Data classification
• Dissect: Data intelligence, Data analytics
• Defend: Access, Inspect, Dispose, Kill
Assume a data-centric security approach – a balance between security and enabling/unleashing a digital workforce
Zero Trust Data Security: Classify the data
[Figure: Zero Trust Data Security Framework diagram with a "Focus on" callout; other labels shown: Access and Inspect, Dispose, Kill]
Dynamic data classification is a key part of the data life-cycle
[Figure: data life-cycle diagram with these stages]
• Creation (new)
• Discovery and inventory (existing)
• Classification and labeling
• Modification (mask, tokenize and/or encrypt)
• Retention
• Deletion
Cupps*: Inventory, defend and protect = ultimate whitelisting
[Figure: overlapping Data, Authorizations and Transactions regions; the overlap (✓) is control]
• Data: What protection as created, at rest, in transit, at recipient and after needed? Physical and logical locations, owner, business risk (classification) and expiration.
• Authorizations: Who, where, when and how vetted? Organizations, users, roles, applications, processes and sources.
• Transactions: What is allowed to be done to the data?
Overlap is control
*Source: James Cupps, CSO, Liberty Mutual
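The "overlap is control" idea can be sketched as a default-deny whitelist: a request is permitted only when the data, the authorization and the transaction all line up with an inventoried entry. This is a minimal sketch under stated assumptions; the `ALLOWED` set, the role name and the `is_permitted` function are hypothetical and do not come from the Cupps material.

```python
# Hypothetical whitelist entries: (data classification, role, operation).
ALLOWED = {
    ("toxic", "claims_adjuster", "read"),
    ("unclassified", "claims_adjuster", "read"),
    ("unclassified", "claims_adjuster", "update"),
}


def is_permitted(classification, role, operation):
    """Default deny: permit only when all three dimensions overlap on the whitelist."""
    return (classification, role, operation) in ALLOWED
```

Anything not explicitly inventoried is refused, which is what distinguishes whitelisting from blocking known-bad requests.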
Summary of WhamTech data security-related options
• Semi-automated data discovery and classification(Ref. 3)
• Metadata management
• Data governance
• Master data management
• Support for LDAP/AD, IAM/Kerberos, SSO, RBAC, ABAC/RLS, CLS and cross/multi-domain
• User authentication
− Pass through organization credentials, e.g., Single Sign-on (SSO)
− Proxy in as different roles
• WhamTech RBAC (aka Security and Privacy Access Profiles (SPAPs))(Ref. 6)
• Security indexes
− Row and column (data element, possible future option)
• Dynamic data masking, tokenization and encryption, including third-party Format-Preserving Encryption (FPE)
• Result-set encryption
• Encryption of indexes, files, folders, directories and/or disk volumes – physical, virtual or logical
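As an illustration of the dynamic masking and tokenization options listed above, the sketch below transforms values at read time and leaves the stored row untouched. It is a sketch under assumptions, not the product's mechanism: `mask`, `tokenize`, `protect` and `SECRET_KEY` are hypothetical names, and the token is a keyed hash (HMAC-SHA256) standing in for tokenization. Vault-based reversible tokens and Format-Preserving Encryption are not shown.

```python
import hashlib
import hmac

# Hypothetical demo key; a real key would come from a key management system.
SECRET_KEY = b"demo-key-not-for-production"


def mask(value, visible=4, fill="*"):
    """Dynamic masking: hide all but the last `visible` characters."""
    if len(value) <= visible:
        return fill * len(value)
    return fill * (len(value) - visible) + value[-visible:]


def tokenize(value):
    """Deterministic keyed token: the same input always yields the same 16-hex-character token."""
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def protect(row, rules):
    """Apply per-column rules ('mask' or 'tokenize') to a copy of the row."""
    actions = {"mask": mask, "tokenize": tokenize}
    return {
        col: actions[rules[col]](val) if col in rules else val
        for col, val in row.items()
    }
```

For example, `protect({"name": "Ann", "ssn": "123-45-6789"}, {"ssn": "mask"})` leaves `name` unchanged and returns the `ssn` with only its last four characters visible. Because the token is deterministic, tokenized columns can still be joined or counted without exposing the original value.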
Example use case: EU General Data Protection Regulation (GDPR)
• EU GDPR is being adopted worldwide, including a few US states
• US companies with an EU entity or employees must comply
• Severe fines of up to 4% of gross worldwide revenue
• Personal data definition is very broad and includes cookies, email and IP addresses
• Includes private, public and work data
• Marketing is opt-in, not opt-out
[Table: THE CHALLENGES, THE SOLUTIONS and THE BENEFITS columns, in BUSINESS SUMMARY and TECHNICAL SUMMARY rows]
THE CHALLENGES
• Multiple disparate systems
• Missing single person views of all related data
• Poor data quality – avoids detection
• Poor access control
• Poor data anonymization
• No way to anonymize or delete opt-out data
• Difficult to know where private data is located
• Difficult to determine if data can be readily used to identify individuals
• Broad definition of personal data
• Impacts marketing
• Severe fines
THE SOLUTIONS
• Use indexes to discover data
• Dynamic data masking, tokenization and encryption
• Single person views of data
• Individual opt-in and opt-out, and write back to systems
• Index or data anonymization, or delete data for opt-outs
• Graph database/visualization
• Could be Cloud-based
• Discover and profile data
• Ensure any and all data is found
• Mitigate risk data
• Enable individual opt-in to marketing and opt-out of data
• Generate compliance reports
• RBAC-based data security
• Establish auditable processes
THE BENEFITS
• Advanced data-centric systems now available
• Potential Cloud management of on-premise systems
• Addresses a massive and looming deadline problem
• Minimally intrusive solution
• Rapid and incremental deployment
• Could be a short-term fix that tends toward a longer-term solution
• Automatically connect people to their personal data
• Explore additional risks
• Open up new channels of customer interaction
• Could be seen as competitive advantage
Source: #WhamTech SDF Use Case - EU General Data Protection Regulation (GDPR)
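Two of the GDPR solutions above, single person views and anonymization for opt-outs, can be sketched with a small cross-system index. This is a simplified sketch under stated assumptions: it matches people on a normalized email address only, whereas real master data management uses far richer matching. The function names, field names and sample systems are hypothetical.

```python
from collections import defaultdict


def build_person_index(systems, key="email"):
    """Map a normalized identifier to every (system, position) that holds it."""
    index = defaultdict(list)
    for system_name, records in systems.items():
        for pos, record in enumerate(records):
            ident = record.get(key)
            if ident:
                index[ident.strip().lower()].append((system_name, pos))
    return index


def opt_out(systems, index, ident, personal_fields=("name", "email")):
    """Blank personal fields in every record linked to `ident`; return how many records changed."""
    locations = index.pop(ident.strip().lower(), [])
    for system_name, pos in locations:
        record = systems[system_name][pos]
        for field_name in personal_fields:
            if field_name in record:
                record[field_name] = None  # anonymize in place (write back to the system)
    return len(locations)
```

The index answers "where is this person's data?" across disparate systems, and the same lookup drives the opt-out, so no system is missed.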
References
1. #WhamTech SDF Use Case - Integrated business intelligence capabilities for employees
2. #WhamTech Automated SmartData Discovery and Classification
3. “The Future of Data Security: A Zero Trust Approach”, Forrester Research, Inc., June 5, 2014.
4. “Rethinking Data Discovery and Data Classification”, Forrester Research, Inc., October 1, 2014 (an update to a previous publication)
5. “Navigate the Future of Identity Management”, Forrester Research, Inc., April 7, 2014
6. “WhamTech Security and Privacy Access Profiles for Federated Data”
The End
Appendices:
Security architecture for SSO in single and multiple domains
Detailed Forrester recommendations
Tableau Single Sign On (SSO) with Row Level Security (RLS) at SDF
In Single Domain Deployment
[Figure: single-domain SSO flow diagram with numbered steps 1-9 and markers *, ** and ***]
Components: ADLocal Domain, AD (Local), Tableau Desktop, Tableau Server, SDF ODBC Driver, keytab file, SDF, data sources DS1, DS2 and DS3
Action labels: View/edit reports; Publish reports/Access data source
Flow annotations:
• *User opens Tableau URL in browser, which causes AD to verify the TS service for the user.
• AD finds the TS Service and authenticates the login user for SSO in to Tableau Service - AD notifies TS Service about the user trying to access
• TS Service account verifies login user identity with AD, logs on as that user, impersonates the user and allows delegation
• Tableau authorizes access to the user after checking in keytab file
• **SSO successful
• ***User opens Tableau workbook/report in browser
• Tableau Server loads ODBC driver as the logged-in user, which identifies the SDF Service and validates the user identity with AD, sends ticket to SDF (delegation)
• SDF Server verifies the end-user with AD and checks access
• Impersonates logged-in user and delegates for RBAC, ABAC/RLS and CLS - user can only access their records
Multi-domain Deployment – Two-way with Selective Authentication
SDF-Trusted Proxy Solution
[Figure: multi-domain SSO flow diagram with numbered steps 1-9 and markers *, ** and ***]
Domains: External Domain (XYZ), ADLocal Domain
Trust/Access:
• XYZ trusts ADLocal. ADLocal users can access XYZ resources.
• ADLocal selectively trusts XYZ. Only selected XYZ users (TS users) can access ADLocal resources.
Components: AD, OVD^, Tableau Desktop, Tableau Server, SDF ODBC Driver, SDF Federation, SDF, data sources (DS)
Flow annotations:
• *User opens Tableau URL in browser, which causes AD to verify the TS service for the user.
• **Kerberos SSO to authenticate user to TS – a valid Kerberos ticket is assigned to a user when they log on to the system, which uses this ticket to find the TS service
• Tableau authorizes access to the user after checking in keytab file
• ***User opens Tableau workbook/report in browser
• Load ODBC Driver on behalf of authenticated user (delegation)
• XYZ AD knows about SDF service and returns a valid ticket, and XYZ user can find and delegate to SDF Federation
• Confirm User Identity
• Trusted proxy user passes along the user identity to implement RLS – as SDF is trusted, user identity confirmation with AD or a valid ticket to access SDF resources on ADLocal domain is not required
• SDF need not identify and check for access for the user at the server side
• Implement RBAC, ABAC/RLS and CLS for delegated user
^OVD = Oracle Virtual Directory
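The trusted-proxy step can be sketched as follows: the data layer accepts an asserted end-user identity only from a connecting account it already trusts, then applies row-level security for that end user. This is a minimal sketch under stated assumptions. The account names, the `owner` column and both functions are hypothetical, and the Kerberos ticket exchange and AD trust checks that precede this step are not modelled.

```python
# Hypothetical service accounts that the data layer trusts to act as a proxy.
TRUSTED_PROXIES = {"svc-tableau@XYZ"}


def effective_user(connecting_account, asserted_user=None):
    """Return the identity that row-level security should be applied to."""
    if asserted_user is None:
        # No proxying: the connecting account is the end user.
        return connecting_account
    if connecting_account not in TRUSTED_PROXIES:
        raise PermissionError("only a trusted proxy may act on behalf of another user")
    return asserted_user


def visible_rows(rows, connecting_account, asserted_user=None):
    """Apply RLS for the effective user; each row carries a hypothetical 'owner' field."""
    user = effective_user(connecting_account, asserted_user)
    return [row for row in rows if row["owner"] == user]
```

The design choice is that trust is placed in the proxy account rather than re-verified per end user, which is why an untrusted account asserting someone else's identity is rejected outright.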
Forrester recommendations
• Identify data where it resides
− Exists everywhere on many devices in many forms, e.g., databases, mainframes, office docs, email, file shares, flat files, Web sites, social media, BYOD, IoT devices, etc.
• Apply labels to classify data and determine how to handle it
− KISS - basically, two types of data:
− That someone wants to steal
− Everything else
− In reality, there are probably at least three main classifications: (i) Unclassified, (ii) toxic and (iii) radioactive
• Engage in real-time, dynamic data classification
− Existing and new data
• Eliminate the idea of a trusted internal network and untrusted external network
− Adopt a Zero Trust Data Security Approach
Zero Trust Data Security: Define the data
• Cannot protect/encrypt all data
− Focus on radioactive and toxic data
• Discover and index all existing and new data wherever it exists
• Develop a life-cycle approach to data
• Classify individual data
− Classification of the most valuable individual data determines the classification of a data-set
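The last rule above, that the most valuable individual data sets the classification of the whole data-set, amounts to taking a maximum over an ordered scale. The sketch below assumes the three labels named in the Forrester recommendations, ordered unclassified < toxic < radioactive. That ordering and the names `Classification` and `classify_dataset` are assumptions made for illustration.

```python
from enum import IntEnum


class Classification(IntEnum):
    """Assumed ordering: a higher value means more sensitive data."""
    UNCLASSIFIED = 0
    TOXIC = 1
    RADIOACTIVE = 2


def classify_dataset(element_classes):
    """A data-set takes the highest classification found among its elements."""
    return max(element_classes, default=Classification.UNCLASSIFIED)
```

A single radioactive column is therefore enough to make an otherwise unclassified table radioactive.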
Zero Trust Data Security: Dissect and analyze the data
• Profile and analyze existing and new data for classification
• Enable classification to change:
− As the value of data changes over time
− Because of changes in government or industry regulations
− As new threat intelligence becomes available, particularly if targeted at a specific industry or enterprise
• Correlate classification with data exfiltration
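Correlating classification with data exfiltration can be sketched as a join between a transfer log and the current classification labels. Because the labels are looked up at evaluation time, a reclassification immediately changes which transfers are flagged. This is a minimal sketch; the log fields (`dataset`, `direction`), the label strings and `flag_exfiltration` are hypothetical.

```python
# Labels treated as sensitive in this sketch.
SENSITIVE = {"toxic", "radioactive"}


def flag_exfiltration(transfers, classifications):
    """Return outbound transfers whose data set is currently classified as sensitive."""
    flagged = []
    for transfer in transfers:
        # Unknown data sets default to 'unclassified' in this sketch.
        label = classifications.get(transfer["dataset"], "unclassified")
        if transfer["direction"] == "outbound" and label in SENSITIVE:
            flagged.append({**transfer, "classification": label})
    return flagged
```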