Data Science and Pending EU Privacy Laws: A Storm on the Horizon
PRIVACY CONSIDERATIONS WITH DATA AND DATA SCIENCE
Agenda
• Intro & Case Studies
• Data & Data Science: Growth and Usage
• Privacy: Storm on the Horizon
• Concluding Thoughts
My Background (Intro & Case Studies)
• Head of Global Business Analytics
• Professor (Advanced Analytics)
• Ph.D. Analytics & Computer Science
• Financial Analytics, Credit Risk and Insurance
• Independent Consultant
Agenda
• Intro & Case Studies
• Data & Data Science: Growth and Usage
– Data Science
– Modern Technology
• Privacy: Storm on the Horizon
Source and Use of Customer Data (Privacy: A Brief Background)
Sources of customer data:
• Observed
• Volunteered
• Data Science
Customer data can be:
• Known
• Used
• Stored
• Shared with 3rd parties
Preparing for Compliance (Privacy: Storm on the Horizon)
What are my data assets?
• Usage
• Storage
• Flow to/from 3rd parties
Sources of data:
• Observed
• Volunteered
• Data Science
Compliance considerations:
• Right to be forgotten
• De-anonymization
• Cloud computing
• Explicit and up-front consent
• Restricted profiling
• Privacy by Design
• Potential liabilities from buying, selling and sharing
Moving Forward (Privacy: Storm on the Horizon)
• Become aware of your entire data ecosystem and how it may expose you to privacy violations
• Audit current data storage and governance for compliance
• Ensure that all product roadmaps comply with the principles of Privacy by Design
• Ensure that proper user consent is in place from the moment of first user registration
• Initiate dialogue with corporate privacy officer or external expert
Privacy by Design
1 Proactive not Reactive; Preventative not Remedial
2 Privacy as the Default Setting
3 Privacy Embedded into Design
4 Full Functionality – Positive-Sum, not Zero-Sum
5 End-to-End Security – Full Lifecycle Protection
6 Visibility and Transparency – Keep it Open
7 Respect for User Privacy – Keep it User-Centric
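Principle 2, privacy as the default setting, lends itself to a small illustration. The following is a minimal sketch only, assuming a hypothetical `PrivacySettings` record whose flag names are invented for this example: every data use starts switched off, and it is turned on only through an explicit opt-in.

```python
from dataclasses import dataclass, field, fields


@dataclass
class PrivacySettings:
    # Hypothetical flags: each data use stays off until the user opts in.
    share_with_third_parties: bool = False
    allow_profiling: bool = False
    allow_marketing_email: bool = False


@dataclass
class Account:
    email: str
    # A new account always starts with the most protective settings.
    privacy: PrivacySettings = field(default_factory=PrivacySettings)


def opt_in(account: Account, setting: str) -> None:
    """Enable one data use in response to an explicit user choice."""
    valid_names = {f.name for f in fields(PrivacySettings)}
    if setting not in valid_names:
        raise ValueError(f"unknown privacy setting: {setting}")
    setattr(account.privacy, setting, True)
```

The design choice is that the user never has to act to be protected: a newly created `Account` shares nothing, and each permission has to be granted individually.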
Privacy by Design for Big Data (Jeff Jonas, IBM)
1. FULL ATTRIBUTION: Every observation (record) needs to know from where it came and when. There cannot be merge/purge data survivorship processing whereby some observations or fields are discarded.
2. DATA TETHERING: Adds, changes and deletes occurring in systems of record must be accounted for, in real time, in sub-seconds.
3. ANALYTICS ON ANONYMIZED DATA: The ability to perform advanced analytics (including some fuzzy matching) over cryptographically altered data means organizations can anonymize more data before information sharing.
4. TAMPER-RESISTANT AUDIT LOGS: Every user search should be logged in a tamper-resistant manner; even the database administrator should not be able to alter the evidence contained in this audit log.
5. FALSE NEGATIVE FAVORING METHODS: The capability to more strongly favor false negatives is of critical importance in systems that could be used to affect someone’s civil liberties.
6. SELF-CORRECTING FALSE POSITIVES: With every new data point presented, prior assertions are re-evaluated to ensure they are still correct, and if no longer correct, these earlier assertions can often be repaired in real time.
7. INFORMATION TRANSFER ACCOUNTING: Every secondary transfer of data, whether to human eyeball or a tertiary system, can be recorded to allow stakeholders (e.g., data custodians or the consumers themselves) to understand how their data is flowing.
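One common way to approach item 4 (tamper-resistant audit logs) is a hash chain, in which each log entry's hash covers the previous entry's hash. The slides do not say how such a log should be built, so the following is only a sketch under that assumption; the function names, record fields and `GENESIS_HASH` value are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # hypothetical starting value for the chain


def _entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous entry's hash together with this entry's payload."""
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("utf-8") + body).hexdigest()


def append_entry(log: list, user: str, query: str, timestamp: str) -> None:
    """Append a search record whose hash depends on all earlier records."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    payload = {"user": user, "query": query, "timestamp": timestamp}
    log.append({"payload": payload, "hash": _entry_hash(prev_hash, payload)})


def verify_log(log: list) -> bool:
    """Recompute the chain; an edited, reordered or mid-log deleted entry fails."""
    prev_hash = GENESIS_HASH
    for entry in log:
        if entry["hash"] != _entry_hash(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True
```

On its own this makes tampering detectable rather than impossible: someone with full write access could rebuild the whole chain or truncate its tail. A real deployment would likely also need to store the latest hash somewhere the database administrator cannot modify.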