© 2011 IBM Corporation 1 Privacy by Design (PbD) Confessions of an Architect Privacy by Design | Time to Take Control Toronto, Canada January 28th, 2011 Jeff Jonas, IBM Distinguished Engineer Chief Scientist, IBM Entity Analytics [email protected]



Page 2

Background

Early 80’s: Founded Systems Research & Development (SRD), a custom software consultancy

1989 – 2003: Built numerous systems for Las Vegas casinos including a technology known as Non-Obvious Relationship Awareness (NORA)

2005: IBM acquires SRD, now chief scientist of IBM Entity Analytics

Personally architected, designed and deployed +/- 100 systems, a number of which contained multi-billions of transactions describing 100’s of millions of entities

Selected Affiliations:

– EPIC, Member, Advisory Board

– Privacy International, Member, Advisory Board

– Markle Foundation, Member, Task Force on National Security in the Information Age

– Senior Associate, Center for Strategic and International Studies (CSIS)

Page 3

A Late Bloomer to Privacy

1980 – 2001 No clue whatsoever

2001 – 2006 Slowly waking up

2007 – 2011 Today, at best, a student of privacy

Page 4

A Journey Fraught with Reflection and Rethinking

The greater my privacy and civil liberties awareness

The greater the number of imperfections that appear in the rearview mirror

Page 5

Katrina – Missing Persons Reunification Project

Information about the status of persons quickly ends up scattered across countless databases

– Over 50 such web sites/organizations were identified as having victim-related data

– Many people were registered multiple times in the same database

– Many people were registered multiple times across databases

– Many people were registered as missing in one database and as found in another database

Connecting found persons previously reported as missing becomes nearly impossible

– Too many databases

– Constantly changing data
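
The reconciliation problem described on this slide can be illustrated with a minimal sketch. It assumes each report is a (source, name, status) tuple and matches people on a normalized name only; the function names and the sample data are hypothetical, and real matching would need far more than a name.

```python
from collections import defaultdict


def normalize(name):
    """Lower-case the name and collapse repeated whitespace."""
    return " ".join(name.lower().split())


def missing_but_found(reports):
    """Return people listed as missing in one place and found in another."""
    statuses = defaultdict(set)
    for source, name, status in reports:
        statuses[normalize(name)].add((source, status))
    result = {}
    for person, seen in statuses.items():
        kinds = {status for _, status in seen}
        if {"missing", "found"} <= kinds:
            result[person] = sorted(seen)
    return result


# Hypothetical reports from two hypothetical web sites.
reports = [
    ("site_a", "Ann  Lee", "missing"),
    ("site_b", "ann lee", "found"),
    ("site_a", "Bo Chan", "missing"),
]
reunify = missing_but_found(reports)
```

Here `reunify` holds only the person who is missing in one source and found in another; a person who is only ever reported missing stays out of the result.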

Page 6

Katrina Reunification Project Statistics

Total data sources 15

Usable records 1,570,000

Unique persons 36,815

Total loved ones reunited >100

Page 7

Katrina – Missing Persons Reunification Project

Privacy by Design

– Contractually authorized to delete all the data after the reunification office completed its work

– Hence, a few months later, all collected data and reporting products were deleted

DESTRUCTION OF EVIDENCE!

Data Decommissioning – Destruction of Accountability

Page 8

“G2”

My Skunk Works Project

Page 9

G2: Sensemaking on Streams

1) Evaluate new information against previous information … as it arrives.

2) Determine if what is being observed is relevant.

3) Deliver this relevant, actionable insight fast enough to do something about it … as it’s happening.

4) Do this with sufficient accuracy and scale to really matter.
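
The first three steps above can be sketched in a few lines. This is only an illustration under stated assumptions, not how G2 works: the `StreamContext` class, the use of a shared key as the relevance test, and the sample observations are all hypothetical.

```python
from collections import defaultdict


class StreamContext:
    """Keeps prior observations indexed by a shared key (hypothetical design)."""

    def __init__(self):
        self._by_key = defaultdict(list)

    def observe(self, key, observation):
        """Compare a new observation with earlier ones, then remember it.

        Returns the earlier observations that share the key. A non-empty
        result is treated here as "relevant".
        """
        related = list(self._by_key[key])
        self._by_key[key].append(observation)
        return related


ctx = StreamContext()
alerts = []
stream = [
    ("340-900-9000", "hotel check-in"),
    ("555-0100", "car rental"),
    ("340-900-9000", "fraud report"),
]
for key, obs in stream:
    related = ctx.observe(key, obs)   # step 1: evaluate as it arrives
    if related:                       # step 2: is it relevant?
        alerts.append((obs, related)) # step 3: deliver the insight now
```

Each observation is evaluated once, on arrival, against what is already known, so nothing waits for a batch run. Step 4 (accuracy and scale) is outside what a toy example can show.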

Page 10

From Pixels to Pictures to Insight

Observations

Contextualization

Information in Context

Relevance

Consumer (An analyst, a system, the sensor itself, etc.)

Page 11

G2: Sensemaking on Streams

Domain: People, organizations, places, things, events … proteins, asteroids, and more.

Will simultaneously commingle and make sense over structured, unstructured, biographic, biometric and geospatial data

Multi-lingual

Even curious: If it is unsure, it figures out whether the question is worth researching and may choose to ask Google, or maybe even a Jeopardy champion, to clear up any confusion

Page 12

Harnessing Big Data. New Physics.

More data: better predictions

More data: bad data … good

More data: less compute

Page 13

Smarter Planet: Example G2 Use Cases

Traffic optimization

– Route suggestions pushed to drivers, just-in-time, to avert significant traffic events

Optimize individual lives

– Search results optimized based on predictions about where you are going next

Pandemic response

– A nation able to work right through an extreme global pandemic with real-time citizen recommendations (e.g., “quarantine yourself!”)

Page 14

THE INFORMATION CONTAINED IN THIS PRESENTATION IS PROVIDED FOR INFORMATIONAL PURPOSES ONLY. ALTHOUGH EFFORTS WERE MADE TO VERIFY THE COMPLETENESS AND ACCURACY OF THE INFORMATION CONTAINED IN THIS PRESENTATION, IT IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED. IN ADDITION, THIS INFORMATION IS BASED ON IBM’S CURRENT PRODUCT PLANS AND STRATEGY, WHICH ARE SUBJECT TO CHANGE BY IBM WITHOUT NOTICE. IBM SHALL NOT BE RESPONSIBLE FOR ANY DAMAGES ARISING OUT OF THE USE OF, OR OTHERWISE RELATED TO, THIS PRESENTATION OR ANY OTHER DOCUMENTATION. NOTHING CONTAINED IN THIS PRESENTATION IS INTENDED TO, OR SHALL HAVE THE EFFECT OF CREATING ANY WARRANTY OR REPRESENTATION FROM IBM (OR ITS AFFILIATES OR ITS OR THEIR SUPPLIERS AND/OR LICENSORS); OR ALTERING THE TERMS AND CONDITIONS OF THE APPLICABLE LICENSE AGREEMENT GOVERNING THE USE OF IBM SOFTWARE.

Page 15

IBM InfoSphere Sensemaking V1.1.0.0

Following two years of skunk works development while guided by privacy by design goals …

it is just possible that there are more privacy and civil liberties enhancing capabilities baked-in, during conception and design, than any other general purpose advanced analytics technology commercially available … on Earth … to date.

Page 16

PbD: Full Attribution

ABOUT THE FEATURE

Every record knows where it came from and when

No merge/purge data survivorship processing

IMPORTANCE

Universal Declaration of Human Rights has four articles containing the word “arbitrary”, e.g., Article 9 reads “No one shall be subjected to arbitrary arrest, detention or exile.” If you don’t know where the data came from, how can this be non-arbitrary?

The ability to identify every original record is essential for reconciliation and audit
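
A minimal sketch of what full attribution might look like in a data structure, assuming an append-only entity that keeps every source record instead of merging them into one surviving row. The class names, field names and sample values are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass(frozen=True)
class SourceRecord:
    source: str          # system the record came from
    record_id: str       # identifier inside that system
    received: datetime   # when it arrived
    name: str


@dataclass
class Entity:
    records: list = field(default_factory=list)

    def add(self, record):
        # Append only: nothing is merged away or overwritten.
        self.records.append(record)

    def provenance(self):
        """List where and when every contributing record came from."""
        return [(r.source, r.record_id, r.received) for r in self.records]


person = Entity()
person.add(SourceRecord("crm", "A-17", datetime(2011, 1, 3), "Pat T Smith"))
person.add(SourceRecord("billing", "9942", datetime(2011, 1, 9), "Patrick T Smith"))
```

Because both name variants survive alongside their origins, any later decision about this entity can be traced back to the original records.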

Page 17

PbD: Data Tethering

ABOUT THE FEATURE

Adds, changes and deletes from source systems can be processed

Real-time, sub-second (not requiring periodic batch reloading)

IMPORTANCE

Data currency in information sharing environments is important, e.g., when derogatory data in error is corrected in a source system, it is vital that such corrections are applied everywhere, immediately
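
As a rough sketch of the idea, assume each source system emits add, change and delete events and a downstream copy applies them one at a time as they arrive. The event layout, the `apply_change` function and the sample data are hypothetical.

```python
def apply_change(store, event):
    """Apply one source-system event to a downstream copy (a plain dict)."""
    key = (event["source"], event["record_id"])
    if event["op"] in ("add", "change"):
        store[key] = event["data"]
    elif event["op"] == "delete":
        store.pop(key, None)
    else:
        raise ValueError(f"unknown op: {event['op']}")
    return store


downstream = {}
events = [
    {"op": "add", "source": "court", "record_id": "77",
     "data": {"status": "warrant"}},
    # The source system corrects its error; the copy follows at once.
    {"op": "change", "source": "court", "record_id": "77",
     "data": {"status": "cleared"}},
]
for event in events:
    apply_change(downstream, event)
```

The downstream copy never holds the erroneous value longer than it takes the correction event to arrive, which is the contrast with periodic batch reloading.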

Page 18

PbD: Analytics on Anonymized Data

ABOUT THE FEATURE

Owners of data can anonymize selected fields before an information transfer

Despite the cryptographic form of the data, deep predictive analytics (including some fuzzy matching) can still be accomplished when fusing this data for discovery and analysis

IMPORTANCE

With every copy of data, there is an increased risk of unintended disclosure

Data anonymized before transfer and anonymized at rest reduces the risk of unintended disclosure

And with full attribution, re-identification is by design to ensure reconciliation and audit
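
One common way to match on anonymized values is a keyed one-way hash applied after normalization. The sketch below uses HMAC-SHA256 from the Python standard library; this is an assumed technique chosen for illustration, not a description of the product's method, and it covers exact matching only (the fuzzy matching mentioned above is not shown). The key and sample values are hypothetical.

```python
import hashlib
import hmac


def anonymize(value, key):
    """Keyed one-way hash of a normalized field value."""
    normalized = "".join(value.lower().split())
    return hmac.new(key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


shared_key = b"example-key-agreed-between-parties"  # hypothetical key

a = anonymize("340-900-9000", shared_key)
b = anonymize(" 340-900-9000 ", shared_key)   # same phone, stray spaces
c = anonymize("703 111-2000", shared_key)     # a different phone
```

Equal inputs give equal tokens, so records can be joined on `a == b` without either side seeing the clear-text phone number. The data owner, who holds the original records, can still map a token back to its source for audit.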

Page 19

PbD: Tamper Resistant Audit Logs

ABOUT THE FEATURE

Who searches for what is logged in a consistent manner

Even the database administrator cannot alter the evidence contained in this log

IMPORTANCE

Every now and then people with access and privileges take a look at records without a legitimate business purpose, e.g., an employee of a banking system looking up their neighbor

Tamper resistant logs make it possible to audit user behavior and can have a chilling effect on misuse
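
A hash chain is one well-known way to make a log tamper-evident: each entry stores a hash of its own content plus the previous entry's hash, so a later edit breaks the chain. The sketch below is an assumption about how such a log could be built, not the product's mechanism, and it detects tampering rather than preventing it. The function names and sample entries are hypothetical.

```python
import hashlib
import json


def _digest(body):
    """SHA-256 over a canonical JSON form of the entry body."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_entry(log, user, query):
    """Add an entry that is chained to the hash of the previous one."""
    prev_hash = log[-1]["hash"] if log else "0" * 64
    body = {"user": user, "query": query, "prev": prev_hash}
    log.append({**body, "hash": _digest(body)})


def verify(log):
    """Return True only if every entry still matches its recorded hash."""
    prev_hash = "0" * 64
    for entry in log:
        body = {"user": entry["user"], "query": entry["query"], "prev": entry["prev"]}
        if entry["prev"] != prev_hash or entry["hash"] != _digest(body):
            return False
        prev_hash = entry["hash"]
    return True


audit_log = []
append_entry(audit_log, "teller_17", "lookup: neighbor")
append_entry(audit_log, "analyst_02", "lookup: case 4411")
```

If anyone rewrites an earlier entry, `verify(audit_log)` returns False unless every later hash is recomputed too, which is why such chains are usually anchored somewhere the administrator cannot reach.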

Page 20

PbD: False Negative Favoring Methods

[Figure, two panels. Each panel shows records numbered 1, 2 and 3, all with the phone number 340-900-9000: Patrick T Smith; Patricia Smith; Pat T Smith; plus the label “Student”.

Panel 1: “??”

Panel 2, EXISTING BEST PRACTICE: “Closest. Hence, for sure”]

Page 21

PbD: False Negative Favoring Methods

ABOUT THE FEATURE

A false negative occurs when something that is true is not detected

Sometimes a new record can belong to two different entities

Usually systems select the strongest of the two

But had there been only one choice, it would have matched to the other

This is now properly handled, in real-time

IMPORTANCE

If a new record gets arbitrarily assigned, you may have inadvertently created a false positive

False positives can adversely affect people’s lives – e.g., the police find themselves knocking down the wrong door or an innocent passenger is denied the ability to board a plane
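
One reading of this feature is: when a new record matches more than one existing entity, do not pick a winner. The sketch below follows that reading; the matching rule (same surname and phone), the `place` function and the decision labels are hypothetical and only illustrate the principle of preferring a possible false negative over an arbitrary assignment.

```python
def matches(record, entity):
    """Hypothetical rule: same phone and same surname."""
    return record["phone"] == entity["phone"] and record["surname"] == entity["surname"]


def place(record, entities):
    """Decide what to do with a new record without guessing."""
    candidates = [name for name, e in entities.items() if matches(record, e)]
    if len(candidates) == 1:
        return ("assign", candidates[0])
    if len(candidates) > 1:
        # Ambiguous: keep the record apart rather than pick the "closest".
        return ("hold", candidates)
    return ("new", None)


entities = {
    "Patrick T Smith": {"surname": "Smith", "phone": "340-900-9000"},
    "Patricia Smith": {"surname": "Smith", "phone": "340-900-9000"},
}
new_record = {"given": "Pat", "surname": "Smith", "phone": "340-900-9000"}
decision = place(new_record, entities)
```

With both candidates present the record is held; with only one of them known, the same record would have been assigned to it, which mirrors the "had there been only one choice" point above.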

Page 22

PbD: False Negative Favoring Methods

[Figure, two panels. Each panel shows records numbered 1, 2 and 3, all with the phone number 340-900-9000: Patrick T Smith; Patricia Smith; Pat T Smith; plus the label “Student”.

Panel 1: “??”

Panel 2, NEW BEST PRACTICE: “100%” and “100%”]

Page 23

PbD: Self-Correcting False Positives

[Figure: three records, numbered 1 to 3, all at 123 Main Street with phone 703 111-2000.

John T Smith Jr, DOB: 03/12/1984

John T Smith, DL: 009900991

A plausible claim these two people are the same

John T Smith Sr, DL: 009900991

Until this record comes into view

Which reveals this is a FALSE POSITIVE]

Page 24

PbD: Self-Correcting False Positives

[Figure: records at 123 Main Street with phone 703 111-2000, labelled 1, 3, 2 and 2. John T Smith Jr, DOB: 03/12/1984; John T Smith, DL: 009900991; John T Smith Sr, DL: 009900991. The John T Smith record with DL: 009900991 is drawn twice.]

New Best Practice: FIXED IN REAL-TIME (not end of month)

Page 25

PbD: Self-Correcting False Positives

ABOUT THE FEATURE

A false positive is an assertion (claim) that is made, but not true

With every new data point presented, all prior assertions are re-evaluated to ensure they are still correct, and if now incorrect, these are repaired

If two people were thought to be the same because they share the same name, address and phone – then later it is discovered this is a JR and SR (two different people), this is now remedied

In real-time, not end of month

IMPORTANCE

False positives can adversely affect people’s lives

Without self-correcting false positives, databases start to drift from the truth and become visibly wrong – necessitating periodic reloading to fix this

Periodic monthly reloading would mean wrong decisions are possible all month until the next reload, even though you knew beforehand
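
The JR/SR example can be sketched with a toy resolver that rebuilds its entities every time a record arrives, so an earlier match is re-checked against new evidence. Everything here is an assumption made for illustration: the rules (a shared driver's license is strong evidence, shared name, address and phone is weak evidence, differing generational suffixes are a conflict), the field names, and the rebuild-from-scratch approach, which a real-time engine would replace with something incremental.

```python
def conflict(a, b):
    """Two records conflict when both carry a suffix and the suffixes differ."""
    sa, sb = a.get("suffix"), b.get("suffix")
    return sa is not None and sb is not None and sa != sb


def resolve(records):
    """Rebuild entities from all records so prior assertions get re-checked."""
    entities = []
    # Place records with a driver's license first: strong evidence anchors
    # the entities, weaker evidence attaches afterwards.
    ordered = sorted(records, key=lambda r: r.get("dl") is None)
    for rec in ordered:
        for ent in entities:
            if any(conflict(rec, other) for other in ent):
                continue
            same_dl = any(rec.get("dl") and rec.get("dl") == o.get("dl") for o in ent)
            same_contact = any(
                (rec["name"], rec["addr"], rec["phone"])
                == (o["name"], o["addr"], o["phone"])
                for o in ent
            )
            if same_dl or same_contact:
                ent.append(rec)
                break
        else:
            entities.append([rec])
    return entities


base = {"name": "John T Smith", "addr": "123 Main Street", "phone": "703 111-2000"}
r1 = {**base, "id": 1, "suffix": "Jr", "dob": "03/12/1984"}
r2 = {**base, "id": 2, "dl": "009900991"}
r3 = {**base, "id": 3, "suffix": "Sr", "dl": "009900991"}

before = resolve([r1, r2])       # records 1 and 2 look like one person
after = resolve([r1, r2, r3])    # record 3 arrives and the claim is repaired
```

In `before`, records 1 and 2 form a single entity. In `after`, record 2 sits with record 3 (same license) and record 1 stands alone, so the earlier false positive is undone as soon as the third record is seen.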

Page 26

PbD: Information Transfer Accounting

Basic Data

Name: Mark T Smith

Address: POB 1346

City: Seattle

Phone: (310) 555-0000

Tax ID: 556-99-9999

Balance: $361.43

Page 27

PbD: Information Transfer Accounting

Who Looked

Date        Name          Why
01/09/2010  Ken Wales     Teller trans
11/24/2010  Susan Callie  Fraud invest

Page 28

PbD: Information Transfer Accounting

Sent Where

Date        Sent to  Why
04/19/2010  ADP      Payroll synch
06/01/2010  Amex     Marketing alliance
07/16/2010  S&J Inc  Third party deal
12/31/2010  IRS      Annual compliance

Page 29

PbD: Information Transfer Accounting

ABOUT THE FEATURE

Can record who inspected each record and store this with the record, much like a credit report has a list of recent parties who have inquired

Can record what records were transferred to secondary systems, allowing users to inspect information flows

IMPORTANCE

It is often cumbersome to learn who has seen what records or what records have been shared system-to-system

Users can now be easily provided such disclosures, increasing transparency and control, e.g., able to recall or cancel information transfers from selected sharing partners
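
A minimal sketch of the idea, assuming the accounting lives on the record itself as two append-only lists. The `AccountedRecord` class and its method names are hypothetical; the sample values are taken from the “Basic Data”, “Who Looked” and “Sent Where” slides.

```python
from dataclasses import dataclass, field


@dataclass
class AccountedRecord:
    data: dict
    viewed_by: list = field(default_factory=list)  # (date, person, reason)
    sent_to: list = field(default_factory=list)    # (date, recipient, reason)

    def view(self, date, person, reason):
        """Return the data and note who looked, when and why."""
        self.viewed_by.append((date, person, reason))
        return self.data

    def transfer(self, date, recipient, reason):
        """Return a copy for a secondary system and note the transfer."""
        self.sent_to.append((date, recipient, reason))
        return dict(self.data)

    def disclosures(self):
        """What could be shown to the person the record is about."""
        return {"who_looked": list(self.viewed_by), "sent_where": list(self.sent_to)}


rec = AccountedRecord({"name": "Mark T Smith", "city": "Seattle"})
rec.view("01/09/2010", "Ken Wales", "Teller trans")
rec.transfer("04/19/2010", "ADP", "Payroll synch")
report = rec.disclosures()
```

Because every read and every outbound copy passes through the record, producing the disclosure is a lookup rather than a search across system logs.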

Page 30

A Wide Number of Privacy by Design Features

Data Tethering                   By design
Analytics on Anonymized Data     By design
Tamper Resistant Audit Log       By design
Information Transfer Accounting  By design
Full Attribution                 Mandatory
False Negative Favoring          Mandatory
Self-Correcting False Positives  Mandatory

Page 31

IBM InfoSphere Sensemaking V1.1.0.0

Smarter & More Responsible

Page 32

IBM InfoSphere Sensemaking V1.1.0.0

Challenge

Try to find another general purpose advanced analytics technology with more privacy and civil liberties enhancing features baked-in by design!

In this competition everyone wins.

Page 33

And more likeminded, nifty features to come …

Page 34

IBM InfoSphere Sensemaking V1.1.0.0

Date of availability: January 28th, 2011 (TODAY!)

~~ Caveat: Limited availability, subject to lab approval ~~

Page 35

Related Reference Material

Big Data. New Physics.

Decommissioning Data: Destruction of Accountability

Source Attribution, Don’t Leave Home Without It

Data Tethering: Managing the Echo

Out-bound Record-level Accountability in Information Sharing Systems

To Anonymize or Not Anonymize, That is the Question

Immutable Audit Logs (IAL’s)

Big Data Flows vs. Wicked Leaks

Page 36

Privacy-Enhancing Technology, State of the Union

Yesterday: Stand-alone privacy-enhancing technologies

– Exist

– If they cost extra, adoption is low and slow

– Some researchers wander off – placing attention elsewhere

Today: Privacy by Design

– Baked in

– No additional cost

– Some privacy and civil liberties enhancing functionality can even be embedded without an off switch

Page 37

Finally …

Privacy by design is more than just technology.

Equal, if not more, attention must be placed on privacy by design when conceiving process and policy.
