Copyright © 2012, Oracle and/or its affiliates. All rights reserved. 1
Bulletproof Your Application Upgrades with Secure Data Masking and Subsetting
Jagan R. Athreya, Oracle
Senior Director, Database Manageability
JT Gaietto, SquareTwo Financial
IT Manager, Security and Compliance
The following is intended to outline our general product direction. It is intended
for information purposes only, and may not be incorporated into any contract.
It is not a commitment to deliver any material, code, or functionality, and should
not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products
remains at the sole discretion of Oracle.
Program Agenda
Why Test Upgrades using Production Data
Data Masking
Data Subsetting
Test System Provisioning
Customer Case Study
Why Test Upgrades using Production Data?
Upgrades are Core to the Success of Your Business
[Figure: release timelines. One timeline marks major and minor releases, planned and released, labeled 11i10, 12, 12.1, 12.1.2, 12.1.3 and 12.2, with dates Nov 04, Jan. 2007, May 09, Dec 09 and Aug 10 and RUP1…RUP2…RUP3…; callouts read "Enhanced Product Areas" and "New Products in 12/12.1". A second timeline shows PeopleTools 8.50 through 8.56 with feature packs, alongside Applications 9.1, 9.2 and 9.3.]
Upgrade Process
Discover: Business Analyst defines the business process (Demonstration environment, vendor sample data)
Develop: Application Developer develops custom programs and interfaces (Development environment, PRODUCTION data)
Test: Software QA tester unit tests and volume tests the application (Test environment, PRODUCTION data)
Migrate: Application DBA performs the upgrade (Production environment, production data)
Production Data in Development and Test: Benefits and Challenges
Development
Benefits
• Robust application functionality based on real-world production use cases
Challenges
• Sensitive data exposed to unprivileged users
• High storage costs due to production data in development
Test
Benefits
• Highly scalable and performant application post-upgrade based on realistic concurrent testing
Challenges
• Sensitive data exposed to unprivileged users
Provisioning Secure Test Environments: Data Masking and Data Subsetting
Data Masking
• Replace sensitive production data with fictitious yet realistic data in Development and Test
• Comply with regulations while enabling realistic application testing
Data Subsetting
• Provision a fractional yet relationally intact representation from production data in Development
• Reduce storage costs while enabling realistic application development
Bulletproof Application Testing
Data Masking and Data Subsetting of Enterprise Applications
Human Capital Management, Supply Chain Management, Financials, Customer Relationship Management
• What is the relationship between tables across the various applications?
• What are the data extraction rules that produce referentially intact subsets?
• Will the subsetted data fit into the available storage for test systems?
• What are the various types of sensitive data used across the applications?
• Where are these sensitive data stored and how are they related to each other?
Data Masking and Data Subsetting Methodology
• Find: Identify sensitive data and discover referential relationships
• Assess: Define the optimal data masking formats and subset criteria
• Secure: Provision secured test / dev systems using masking+subsetting
• Test: Verify the integrity of applications and security of data
DEPT LAST_NAME SSN SALARY
10 SMITH 111-23-1111 40,000
10 JOHNSON 222-34-1345 60,000
Production Non-Production
IBM DB2 MS SQL
DEPT LAST_NAME SSN SALARY
10 AGUILAR 203-33-3234 40,000
20 BENSON 543-44-2737 30,000
10 CHANDRA 323-22-2943 60,000
30 DONNER 270-24-4023 50,000
ORACLE
Data Masking Key Concepts
[Diagram labels: Application Data Modeling, Sensitive Data Identification, Referential Relationships, Mask Format, Data Masking Definition]
Sensitive Data Identification
What is sensitive data?
• Business driven
• Privacy law (HIPAA), industry regulations (PCI-DSS)
• Confidentiality rules (data protection policies)
Where is sensitive data stored?
• Application Documentation
• User interface screens
• Schema-based, e.g. EMP.NATIONAL_ID
• Data-based, e.g. 16-digit numbers beginning with 4 or 5
Person Name | Bank Account Number
Maiden Name | Card Number (Credit or Debit Card Number)
Business Address | Tax Registration Number or National Tax ID
Business Telephone Number | Person Identification Number
Business Email Address | Welfare Pension Insurance Number
Custom Name | Unemployment Insurance Number
Employee Number | Government Affiliation ID
User Global Identifier | Military Service ID
Party Number or Customer Number | Social Insurance Number
Account Name | Pension ID Number
Mail Stop | Article Number
GPS Location | Civil Identifier Number
Student Exam Hall Ticket Number | Hafiza Number
Club Membership ID | Social Security Number
Library Card Number | Trade Union Membership Number
Identity Card Number | Pension Registration Number
Instant Messaging Address | National Insurance Number
Web site | Health Insurance Number
National Identifier | Personal Public Service Number
Passport Number | Electronic Taxpayer Identification Number
Driver’s License Number | Biometrics Data
Personal Address | Digital ID
Personal Telephone Number | Citizenship Number
Personal Email Address | Voter Identification Number
Visa Number or Work Permit | Residency Number (Green Card)
Sensitive Column Types Library
Pre-defined search patterns (Schema and Data based)
Search entire application for data matching patterns
Ranked based on match to search patterns
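The pattern-based search above can be sketched roughly as follows. This is an illustration only, not Oracle's implementation: the type names, regular expressions, scoring rule, and sample values are all assumptions made for the example.

```python
import re

# Hypothetical sensitive column types. Each pairs a schema-based pattern
# (matched against the column name) with a data-based pattern (matched
# against sampled values). Illustrative only, not Oracle's library.
SENSITIVE_TYPES = {
    "NATIONAL_ID": {
        "name": re.compile(r"SSN|NATIONAL_ID", re.IGNORECASE),
        "data": re.compile(r"^\d{3}-\d{2}-\d{4}$"),
    },
    "CARD_NUMBER": {
        "name": re.compile(r"CARD|CC_NUM", re.IGNORECASE),
        # 16-digit numbers beginning with 4 or 5
        "data": re.compile(r"^[45]\d{15}$"),
    },
}


def score_column(column_name, sample_values, patterns):
    """Score in [0, 1]: half for a name match, half for the fraction
    of sampled values that match the data pattern."""
    name_score = 0.5 if patterns["name"].search(column_name) else 0.0
    if sample_values:
        hits = sum(1 for v in sample_values if patterns["data"].match(str(v)))
        data_score = 0.5 * hits / len(sample_values)
    else:
        data_score = 0.0
    return name_score + data_score


def find_sensitive_columns(table):
    """Rank (column, type, score) candidates, best match first."""
    results = []
    for column_name, values in table.items():
        for type_name, patterns in SENSITIVE_TYPES.items():
            score = score_column(column_name, values, patterns)
            if score > 0:
                results.append((column_name, type_name, score))
    return sorted(results, key=lambda r: r[2], reverse=True)


# Fictitious sample values for a table keyed by column name.
EMP = {
    "LAST_NAME": ["SMITH", "JOHNSON"],
    "SSN": ["111-23-1111", "222-34-1345"],
    "PAYMENT_REF": ["4111111111111111", "n/a"],
}

ranked = find_sensitive_columns(EMP)
for column, type_name, score in ranked:
    print(f"{column:12} {type_name:12} {score:.2f}")
```

Here SSN ranks first because both its name and its data match, while PAYMENT_REF is flagged only by a partial data match. A real search would sample values from the database rather than use in-memory lists.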
Data Masking Format Library
40+ out-of-the-box mask formats for common PII data
Built-in mask primitives to expand format library
User-defined SQL or PL/SQL based functions for complex formats
Define once, deploy everywhere
Self-updateable for Oracle supplied mask formats
Automatic Referential Integrity: Database- and application-enforced
Database Referential Integrity: Read data dictionary to enforce referential relationships
Application Referential Integrity: Read application metadata (EBS, Fusion Apps) to enforce referential relationships
EMPLOYEE
EMPID NAME TITLE
12 SMITH SALESREP
13 JONES CSR
14 MCNEIL CEO
15 SALVATORE SALES MGR
CUSTOMER
CUSTID NAME REP_ID
200 ACME 12
201 BIG BOX 15
SUPPORT
CUSTID CSR_ID
200 13
Data distribution: Maintain same data distribution after masking
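As a rough sketch of what preserving referential relationships means, the example below masks EMPID in the EMPLOYEE table and applies the same mapping to the columns that reference it. The foreign-key list, the replacement ID range starting at 9000, and the function names are assumptions for illustration. Key columns are used only because the sample tables above are related through them.

```python
import random

# Sample tables from the slide, as lists of dicts.
EMPLOYEE = [
    {"EMPID": 12, "NAME": "SMITH", "TITLE": "SALESREP"},
    {"EMPID": 13, "NAME": "JONES", "TITLE": "CSR"},
    {"EMPID": 14, "NAME": "MCNEIL", "TITLE": "CEO"},
    {"EMPID": 15, "NAME": "SALVATORE", "TITLE": "SALES MGR"},
]
CUSTOMER = [
    {"CUSTID": 200, "NAME": "ACME", "REP_ID": 12},
    {"CUSTID": 201, "NAME": "BIG BOX", "REP_ID": 15},
]
SUPPORT = [{"CUSTID": 200, "CSR_ID": 13}]

# Assumed relationships; a real tool would read these from the data
# dictionary or from application metadata.
FOREIGN_KEYS = [("CUSTOMER", "REP_ID"), ("SUPPORT", "CSR_ID")]


def build_key_map(keys, rng, start=9000):
    """Map each original key to a distinct replacement value."""
    replacements = list(range(start, start + len(keys)))
    rng.shuffle(replacements)
    return dict(zip(keys, replacements))


def mask_keys(tables, parent, parent_key, foreign_keys, rng):
    """Mask the parent key and apply the same mapping to every
    referencing column, working on copies of the rows."""
    key_map = build_key_map([row[parent_key] for row in tables[parent]], rng)
    masked = {name: [dict(row) for row in rows] for name, rows in tables.items()}
    for row in masked[parent]:
        row[parent_key] = key_map[row[parent_key]]
    for child, column in foreign_keys:
        for row in masked[child]:
            row[column] = key_map[row[column]]
    return masked, key_map


tables = {"EMPLOYEE": EMPLOYEE, "CUSTOMER": CUSTOMER, "SUPPORT": SUPPORT}
masked, key_map = mask_keys(
    tables, "EMPLOYEE", "EMPID", FOREIGN_KEYS, random.Random(7)
)

# Joins still resolve to the same employees after masking.
by_id = {row["EMPID"]: row["NAME"] for row in masked["EMPLOYEE"]}
for row in masked["CUSTOMER"]:
    print(row["NAME"], "is served by", by_id[row["REP_ID"]])
```

Because one mapping is applied to parent and child columns, ACME still joins to SMITH and BIG BOX to SALVATORE, whatever replacement values are drawn.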
Sophisticated Masking Techniques
Compound masking
– Multi-column groups, e.g. address (street, city, state, zip)
Condition-based masking
– Mask national identifiers, e.g. social security number, by country-specific format
Deterministic (Repeatable)
– Repeatable consistent masking, e.g. referential integrity across databases, or consistency across refreshes
Reversible
– Key-based format preserving masking, e.g. outsourced data processing
Shuffle
– Randomization within population set
Perturbation
– Random mask generation around a value, e.g. date of birth
Extensible (via PL/SQL)
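Three of these techniques (deterministic, shuffle, perturbation) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the product's algorithms: the HMAC-based lookup, the key, the replacement names, and the 30-day window are all hypothetical choices.

```python
import hashlib
import hmac
import random
from datetime import date, timedelta


def deterministic_mask(value, key, choices):
    """Repeatable masking: the same input and key always select the
    same replacement, so results agree across databases and refreshes.
    Different inputs may still map to the same replacement."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    index = int.from_bytes(digest[:8], "big") % len(choices)
    return choices[index]


def shuffle_mask(values, rng):
    """Shuffle: reorder values within the column. The population and
    its distribution are unchanged, but rows no longer line up."""
    shuffled = list(values)
    rng.shuffle(shuffled)
    return shuffled


def perturb_date(value, max_days, rng):
    """Perturbation: move a date by a random offset of at most
    max_days in either direction."""
    return value + timedelta(days=rng.randint(-max_days, max_days))


rng = random.Random(42)
key = b"example-key"  # hypothetical secret, for illustration only
names = ["AGUILAR", "BENSON", "CHANDRA", "DONNER"]

masked_a = deterministic_mask("SMITH", key, names)
masked_b = deterministic_mask("SMITH", key, names)  # same result as masked_a

salaries = [40000, 60000, 30000, 50000]
shuffled = shuffle_mask(salaries, rng)

dob = date(1980, 6, 15)
new_dob = perturb_date(dob, 30, rng)
```

Reversible, format-preserving masking is left out on purpose: it needs a proper format-preserving encryption scheme, which does not fit a short sketch.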
E-Business Suite Data Masking Templates
Spans entire application suite
– 950 Columns / 1900 rules
– HCM - Payroll, Employment Details, Personal Info
– TCA, ATG, Financials, Projects…
Best practices for identifying and masking sensitive data
– Columns, Relationships, and Masking rules for PII and Sensitive attributes
– Instructions for wiping credentials after cloning (Support Note 419475.1)
– De-identification done across the DB
What is not masked in the EBS template
Financial data
– Results
– Forecasts
Design Specs
Unstructured data
– Descriptive Flex Fields
‘Developer’ and ‘Localization’ flex fields masked where the content is known
– Notes
– Attachments
Internal primary keys (user_id, person_id)
Masking Application Users
EBS template masks names of Apps (FND) users
– Existing users not usable after masking
For seeded users (e.g. SYSADMIN) the names will be retained but passwords will be reset
If there is a need to maintain local application users for testing
– New interface table provided to populate ‘exempt users’:
Users exempted from masking
Option provided to rename users (e.g. JSMITH to TEST1)
Specify a new password
Data Masking for PeopleSoft
• Built into PeopleSoft 9.2
• Configurable via PeopleTools 8.5.3
• Ships with out-of-the-box masking techniques across the application family
Application Templates – Self-update
Problem: Reduce the manual effort of installing and configuring application templates for data masking
Solution: Self-update
– Application masking templates delivered from EM Store @ Oracle to EM software library securely via self-update
– Mask formats and mask definition transferred and imported into data masking automatically
– Ready for implementation
[Diagram: a template distributed to Enterprise Manager sites (EM @ Telco, EM @ Govt., EM @ Retail, EM @ Mfg.) and their databases]
Data Subsetting Key Concepts
[Diagram labels: Application Data Modeling, Sensitive Data Identification, Referential Relationships, Mask Format, Data Masking Definition, Data Extraction Clauses, Subset Parameters, Data Subset Definition]
Subsetting Production Data
Criteria based
– Fiscal Year
– Geographical Region
Storage goal based
– Percentage of production size
– Target storage footprint
[Figure: production data divided by region (North America, Europe, Asia) and by fiscal year (FY2009, FY2010, FY2012, FY2013); storage-goal examples labeled 10% and 30%, 120M and 360M]
Criteria Based vs. Storage Goal Based Subset: Tradeoffs
Criteria-based
Advantage
• Subsetted data consistent with specified data extraction criteria
Disadvantage
• Application metadata knowledge is a prerequisite
• Extracted data may not fit into target system without trial-and-error
Storage Goal-based
Advantage
• Simple to specify, e.g. 10% or 120GB
Disadvantage
• Inconsistent data provisioned in subset
• Not able to support realistic testing
Criteria-based subset: Process
Select application
– Automatically select the tables needed to be included in subset
Define extraction criteria
– Automatically traverse down relational hierarchy to identify rows
Specify subset parameter
– Analyzes table statistics to estimate size of the database generated from the specified subset parameter
[Figure labels: Time (FY:2011), Dimension (Region:Asia), Space (Size:10%)]
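The process above can be sketched on a toy parent/child pair: select parent rows by the extraction criteria, follow the relationship down to the child rows, then estimate the subset's relative size. The table contents and function names are hypothetical, and row counts stand in for the table statistics a real tool would analyze.

```python
# Hypothetical parent and child tables (fictitious rows).
INVOICES = [
    {"INVOICE_ID": 1, "FISCAL_YEAR": 2011, "REGION": "Asia"},
    {"INVOICE_ID": 2, "FISCAL_YEAR": 2011, "REGION": "Europe"},
    {"INVOICE_ID": 3, "FISCAL_YEAR": 2010, "REGION": "Asia"},
    {"INVOICE_ID": 4, "FISCAL_YEAR": 2011, "REGION": "Asia"},
]
INVOICE_LINES = [
    {"LINE_ID": 10, "INVOICE_ID": 1},
    {"LINE_ID": 11, "INVOICE_ID": 1},
    {"LINE_ID": 12, "INVOICE_ID": 2},
    {"LINE_ID": 13, "INVOICE_ID": 3},
    {"LINE_ID": 14, "INVOICE_ID": 4},
]


def subset(parents, children, predicate, parent_key, child_fk):
    """Keep parent rows that meet the criteria, then keep only the
    child rows that reference a kept parent."""
    kept_parents = [row for row in parents if predicate(row)]
    kept_keys = {row[parent_key] for row in kept_parents}
    kept_children = [row for row in children if row[child_fk] in kept_keys]
    return kept_parents, kept_children


def estimated_fraction(kept_tables, full_tables):
    """Estimate subset size as kept rows over total rows."""
    kept = sum(len(t) for t in kept_tables)
    full = sum(len(t) for t in full_tables)
    return kept / full


def criteria(row):
    """Extraction criteria: fiscal year 2011, region Asia."""
    return row["FISCAL_YEAR"] == 2011 and row["REGION"] == "Asia"


kept_invoices, kept_lines = subset(
    INVOICES, INVOICE_LINES, criteria, "INVOICE_ID", "INVOICE_ID"
)
fraction = estimated_fraction(
    [kept_invoices, kept_lines], [INVOICES, INVOICE_LINES]
)
print(f"{len(kept_invoices)} invoices, {len(kept_lines)} lines, "
      f"{fraction:.0%} of original rows")
```

Every kept line still points at a kept invoice, which is what makes the subset referentially intact. Whether it fits the target storage is only known after the estimate, which is the trial-and-error cost of the criteria-based approach.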
Criteria based: Specifying Table Types
Table sizes are a function of their table type
– Large tables: Transaction, Fact
– Small tables: Dimension, Lookup, Reference
Large Tables: Subset criteria specified including on their descendants
Small Tables: No subset criteria specified; tables brought over in full
[Diagram: example tables AP_INVOICES, AP_INVOICE_LINES, PO_VENDORS, SALES_FACT, REGION, PRODUCT, CHANNEL, BUSINESS UNIT, MFG_PARTS, FISCAL YEAR]
Data Masking and Data Subsetting: Integrated with all methods of database provisioning
RMAN copy
Thin Provisioning-based Cloning
Backup / restore
Data Pump Export/Import
Data Masking + Data Subsetting
Database (Thin) Cloning Using Copy on Write
Production Database
NAME SSN SALARY
SMITH 111-22-3333 60,000
MILLER 112-23-4567 40,000
Masking & Subsetting
Test Master Database
NAME SSN SALARY
AGUILAR 203-33-3234 40,000
BENSON 323-22-2943 60,000
Functional Copies – User can create multiple copies for functional testing without consuming space
Cloud users can create their own snapshots and “time travel” to an earlier snapshot
• Database Cloning in Minutes
• Leverage storage copy-on-write technologies
• Initial Support for Sun ZFS Storage and NetApp Storage
• More than Just FlexClone
• Integrated with Data Management (Masking & Subsetting)
• Configuration Management (Lineage and Association Tracking)
• Storage Aware “Metering & Chargeback”
At-Source Data Masking: Maximum Compliance with PCI-DSS
Before: Clone-and-Mask (Production to Test)
Production data is copied to Test and then masked, leaving sensitive data vulnerable until it’s masked.
NEW: At-Source Masking (Production to Masked Data Pump File to Test)
Sensitive data is desensitized and de-identified at the source before it leaves the production DB; production DB remains unchanged.
Integrated Subsetting and Masking: High Performance Secure Test System Provisioning
Before: Data Subset, then Clone and Mask (Production to Test)
Production data had to be subsetted first and then sensitive data masked separately.
NEW: One-step Subset and Mask (Production to Masked Data Pump File to Test)
Production data is subsetted and sensitive data masked in one step using At-source Masking.
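The one-step idea can be illustrated with a small streaming export: each source row is filtered and masked before it is written, so unmasked values never reach the export file and the source stays untouched. This is a sketch only; the CSV output stands in for a Data Pump file, and the row data, filter, and trivially simple mask are assumptions for the example.

```python
import csv
import io


def export_subset_masked(rows, keep, mask, out):
    """Stream rows from the source; write only rows in the subset,
    masking each one before it is written to the export."""
    writer = None
    count = 0
    for row in rows:
        if not keep(row):
            continue
        masked = mask(row)
        if writer is None:
            writer = csv.DictWriter(out, fieldnames=list(masked))
            writer.writeheader()
        writer.writerow(masked)
        count += 1
    return count


def mask_row(row):
    """Placeholder mask: returns a copy with fixed stand-in values.
    A real mask would generate fictitious yet realistic data."""
    masked = dict(row)  # copy, so the source row is not modified
    masked["LAST_NAME"] = "MASKED"
    masked["SSN"] = "000-00-0000"
    return masked


# Fictitious production rows.
PRODUCTION = [
    {"DEPT": 10, "LAST_NAME": "SMITH", "SSN": "111-23-1111", "SALARY": 40000},
    {"DEPT": 20, "LAST_NAME": "LEE", "SSN": "333-45-6789", "SALARY": 30000},
    {"DEPT": 10, "LAST_NAME": "JOHNSON", "SSN": "222-34-1345", "SALARY": 60000},
]

export_file = io.StringIO()  # stands in for the export file
exported = export_subset_masked(
    PRODUCTION, lambda row: row["DEPT"] == 10, mask_row, export_file
)
export_text = export_file.getvalue()
```

The clone-and-mask alternative would first write all three rows unmasked and fix them up afterwards, which is the exposure window the slide describes.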
High Performance Data Masking and Subsetting
Mask 600 billion row table in 33 minutes
1% subset of 100TB table in 6.5 hours
1% subset + masking of 110TB table in 5.8 hours
Performance benchmarks on Exadata X2-2 full rack
Oracle Differentiators
Centralized comprehensive secure test system provisioning
– User-extensible library with sophisticated masking techniques
Integrated with application management lifecycle
– One tool for provisioning, monitoring, patching for entire Oracle application infrastructure (application to disk)
– Integrated with testing tools
Optimized for Oracle Applications
– Application security built by the application development teams
High performance database-optimized execution
Customer Case Study: SquareTwo Financial
ABOUT US
WE ARE AMONG THE TOP DISTRESSED ASSET BUYERS IN THE UNITED STATES.
SquareTwo Financial is dedicated to making distressed assets whole.
Our sole focus is accelerating financial asset recovery through industry-leading asset modelling, award-winning technology and our pioneering legal Partners Network, which works with consumers to remedy their outstanding debt, while fully protecting the client's brand.
• Founded in 1994
• Based in Denver, Colorado
• Manage over $9.5 billion in assets
• Award-winning proprietary
technology
• Pioneering attorney-based Partner
Network
• A+ BBB rating
• Public company (trade bonds)
BUSINESS CHALLENGES
eAGLE is our internally developed financial services collection platform. eAGLE is responsible for housing 15.9 million accounts. These accounts have a number of types of NPI data such as Social Security number, address, date of birth, credit history and so forth. The combination of the growth in accounts housed in eAGLE and the move to an Agile development methodology with two-week sprints created development and testing delays in order to maintain compliance. In addition, the in-house tool was serial and was not able to multi-thread or utilize multiple cores of a CPU. Once specific fields were masked or redacted, the data structure relationships were destroyed.
BUSINESS CHALLENGES
Increased compliance regulation
• Consumer Financial Protection Bureau [CFPB]
• PCI-DSS
• Sarbanes-Oxley
• HIPAA-HITECH
Fast-paced business growth
• 37% Year-over-Year Business Growth
• Expansion into new Financial Services verticals such as Commercial and HealthCare
• Business requirement to support the continued momentum and market share growth
Agile two-week development cycles
• In order to aid in the momentum and growth of SquareTwo and our eAGLE Financial Services Platform, Development adopted an Agile approach to development.
Legacy data “masking” tool issues
• Growth in database size to 3 TB required a 14-15 day lead time to create a masked copy database
• Move to two-week development sprints made it difficult to provide masked Development and Testing environments
• Legacy tool was developed in house and was not designed to scale vertically, making it unable to use multiple CPU cores
EVALUATION CRITERIA
Application aware:
The solution must mask or redact all NPI data while leaving data structure relationships intact, without adding significantly to the DBA workload.
High Performance:
Creating new Development and Testing environments using the in-house tool currently required a 14-15 day lead time to munge and replicate the eAGLE account database structure (3 TB).
Cost:
The initial estimate was that moving development resources to update our in-house application, and pushing scheduled development work to a later date, would cost SquareTwo Financial around $300k.
WHY ORACLE
CONSIDERATIONS FOR CHOOSING ORACLE
Cost
• Development time vs. purchasing “off the shelf”
Ease of Implementation
• Integration with Exadata platform
• Time to implement and configure
• Familiar Partner
Trusted Partners
• SquareTwo Financial and Oracle have been strategic partners for over 10 years
• SquareTwo Financial was the first to install a full Exadata and Exalogic Frame
Scalability
• New solution must support multi-threading and multi-core architectures
• Solution must cut down the time to replicate environments (14-15 days for 3 TB of data)
BUSINESS BENEFITS
Implementation:
SquareTwo Financial was able to have Oracle aid in the implementation of the Data Masking product. Since SquareTwo Financial already knew the data definitions and areas of concern, the installation of the Data Masking suite was completed in two weeks.
Outcome:
Once installed, the DBA team was able to replicate the 3TB eAGLE account database in 15 hours.
This directly led to improved Development and QA testing support, with an average of 6 active Development and Testing environments for two separate sprint teams.
BUSINESS BENEFITS
Two-week initial rollout of Data Masking solution
Move from a 14-15 day database refresh to 15 hours: a 96% improvement!
Improved support for the QA and Development teams during their two-week sprint cycles
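The 96% figure can be checked with a little arithmetic, assuming the 14-15 day lead time means full 24-hour days (the slides do not say whether working days were meant):

```python
HOURS_PER_DAY = 24


def time_reduction(old_days, new_hours):
    """Fraction of the original refresh time that was eliminated."""
    return 1 - new_hours / (old_days * HOURS_PER_DAY)


for old_days in (14, 15):
    reduction = time_reduction(old_days, 15)
    print(f"{old_days} days -> 15 hours: {reduction:.1%} less time")
# 14 days -> 15 hours: 95.5% less time
# 15 days -> 15 hours: 95.8% less time
```

Both ends of the range round to 96%, which is consistent with the improvement quoted above.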
BEST PRACTICES
• Understand what your business requirements are, based on the information stored in your database, and what your compliance requirements are (e.g. do you have Credit Card or Tax ID data stored in your database?).
• Know what you want and have to mask based on legal and compliance requirements. Map this to your dataset before starting your masking project.
• Do NOT underestimate the value of existing partners. While a new partner may be able to offer a lower price, there is always value with the partner you share more eggs with.
• Understand the difference between data masking and data subsetting and which of these functions you need. Some vendors include this in one tool; Oracle today does not.
Oracle Database Security Solutions: Defense-in-Depth for Data Protection and Regulatory Compliance
DETECTIVE
• Activity Monitoring
• Database Firewall
• Auditing and Reporting
PREVENTIVE
• Masking, Encryption, Redaction
• Separation of Duties
• Multi-Factor Authorization
ADMINISTRATIVE
• Database Lifecycle Management
• Vulnerability Scanning
• Data Discovery and Classification
Additional Oracle Enterprise Manager Sessions
Time Session Title Location
Tuesday 1:15pm-2:15pm Database Lifecycle Management with Enterprise Manager 12c Moscone West 3020
5:00pm-6:00pm Deep Dive 3D on Exadata Management: From Discovery to Deployment to Diagnostics Moscone West 3018
Wednesday 11:45am-12:45pm Database as a Service using Oracle Enterprise Manager 12c Moscone West 3020
1:15pm-2:15pm DB Performance Tuning Boot Camp: 10 New Problem-Solving Tips Using ASH & AWR Moscone South 103
3:30pm-4:30pm SQL Tuning Experts Panel: Your Questions on Explain Plan to Exadata Answered Moscone South 102
Technical Sessions and Hands-on Labs
Time Hands-on Lab Location
Tuesday 1:15pm-2:15pm Database Performance Tuning Hands-on Lab Marriott Marquis - Salon 5/6
Wednesday 10:15am-11:15am Database Performance Tuning Hands-on Lab Marriott Marquis - Salon 5/6
11:45am-12:45pm Database Performance Tuning Hands-on Lab Marriott Marquis - Salon 5/6
Thursday 12:45 pm-1:45pm Database Performance Testing, Data Masking, and Subsetting Hands-on Lab Marriott Marquis - Salon 5/6