Responsible Methods for Sharing Data



251 Laurier Avenue W, Suite 200

Ottawa, ON Canada K1P 5J6

www.privacyanalytics.ca | 855.686.4781

[email protected]

Responsible Methods for Sharing Data

Khaled El Emam (PhD)

De-identification Symposium

21st October 2014

© 2014 Privacy Analytics, Inc.

Motivations for Anonymization

• Legal or regulatory requirements

• Obtaining patient consent/authorization – not practical for large databases and introduces bias

• Limiting principles / minimal necessary

• Contractual obligations

• Maintain public / consumer / patient trust

• Costs of breach notification

• Rising discipline of re-identification attacks


Canadian Definitions of Identifiability

Privacy law and definition:

Ontario PHIPA: “Identifying information” means information that identifies an individual or for which it is reasonably foreseeable in the circumstances that it could be utilized, either alone or with other information, to identify an individual.

Nfld PPHI: “Identifying information” means information that identifies an individual or for which it is reasonably foreseeable in the circumstances that it could be utilized either alone or together with other information to identify an individual.

Sask HIPA: “De-identified personal health information” means personal health information from which any information that may reasonably be expected to identify an individual has been removed.


Canadian Definitions of Identifiability (continued)

Privacy law and definition:

Alberta HIA: “Individually identifying” means that the identity of the individual who is the subject of the information can be readily ascertained from the information; “nonidentifying” means that the identity of the individual who is the subject of the information cannot be readily ascertained from the information.

NB PPIA: “Identifiable individual” means an individual can be identified by the contents of the information because the information includes the individual’s name, makes the individual’s identity obvious, or is likely in the circumstances to be combined with other information that includes the individual’s name or makes the individual’s identity obvious.


Existing De-identification Standards


Resources


Anonymization Education & Credentials

• Two-day course on risk management when sharing data will be provided by Ryerson (February 2015)

• Privacy Analytics is launching an online exam for anonymization professionals (November 2014)

• Ongoing educational and professional development opportunities in this area through Privacy Analytics

• Methodology manuals are being developed for specific types of data, e.g., clinical trials and geospatial data


State Discharge Databases Attack - I

• Some states release or sell their hospital discharge database for free or a small fee

• Information about medical incidents published in newspapers is matched with the White Pages and the publicly available state hospital discharge database
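The matching step described on this slide can be illustrated with a minimal Python sketch. It is not the method used in the actual attack: the field names, the choice of linking keys, and all records are hypothetical toy values, and the White Pages lookup is left out. A news report counts as linked only when exactly one discharge record shares its quasi-identifier values.

```python
from collections import defaultdict

# Hypothetical toy data; every field name and value is an illustrative assumption.
news_reports = [
    {"name": "A. Smith", "age": 61, "sex": "M", "admit_month": "2011-03", "zip": "98101"},
    {"name": "B. Jones", "age": 34, "sex": "F", "admit_month": "2011-05", "zip": "98052"},
]
discharge_records = [
    {"record_id": 1, "age": 61, "sex": "M", "admit_month": "2011-03", "zip": "98101", "diagnosis": "fracture"},
    {"record_id": 2, "age": 34, "sex": "F", "admit_month": "2011-05", "zip": "98052", "diagnosis": "burn"},
    {"record_id": 3, "age": 34, "sex": "F", "admit_month": "2011-05", "zip": "98052", "diagnosis": "concussion"},
    {"record_id": 4, "age": 70, "sex": "M", "admit_month": "2011-07", "zip": "98101", "diagnosis": "stroke"},
]

# Assumed linking keys: fields present in both sources.
LINK_KEYS = ("age", "sex", "admit_month", "zip")


def link(reports, records, keys=LINK_KEYS):
    """Return {name: record_id} for reports that match exactly one record."""
    index = defaultdict(list)
    for rec in records:
        index[tuple(rec[k] for k in keys)].append(rec)

    unique_matches = {}
    for rep in reports:
        candidates = index.get(tuple(rep[k] for k in keys), [])
        if len(candidates) == 1:  # ambiguous or empty matches are not counted
            unique_matches[rep["name"]] = candidates[0]["record_id"]
    return unique_matches


matches = link(news_reports, discharge_records)
```

In this toy data only the first report links to a single record; the second matches two records and therefore stays ambiguous, which is the effect that generalizing quasi-identifiers aims to produce.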


State Discharge Databases Attack - II


http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0028071


Anonymization = Risk Management


No Zero Risk


Direct & Quasi-Identifiers

Examples of direct identifiers: Name, address, telephone number, fax number, MRN, health card number, health plan beneficiary number, VID, license plate number, email address, photograph, biometrics, SSN, SIN, device number, clinical trial record number

Examples of quasi-identifiers: sex, date of birth or age, geographic locations (such as postal codes, census geography, information about proximity to known or unique landmarks), language spoken at home, ethnic origin, total years of schooling, marital status, criminal history, total income, visible minority status, profession, event dates, number of children, high level diagnoses and procedures
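To make the distinction concrete, the sketch below drops direct identifiers and then groups the remaining records by their quasi-identifier values. The slides do not state a risk metric; taking the reciprocal of the smallest group size is a commonly used measure and is an assumption here, as are the column names and the toy records.

```python
from collections import Counter

# Column roles follow the slide's examples; the column names are hypothetical.
DIRECT_IDENTIFIERS = {"name", "health_card_number"}
QUASI_IDENTIFIERS = ("sex", "year_of_birth", "postal_prefix")

# Toy records invented for illustration only.
records = [
    {"name": "P1", "health_card_number": "0001", "sex": "F", "year_of_birth": 1980, "postal_prefix": "K1P", "diagnosis": "asthma"},
    {"name": "P2", "health_card_number": "0002", "sex": "F", "year_of_birth": 1980, "postal_prefix": "K1P", "diagnosis": "diabetes"},
    {"name": "P3", "health_card_number": "0003", "sex": "F", "year_of_birth": 1980, "postal_prefix": "K1P", "diagnosis": "asthma"},
    {"name": "P4", "health_card_number": "0004", "sex": "M", "year_of_birth": 1975, "postal_prefix": "K1P", "diagnosis": "fracture"},
    {"name": "P5", "health_card_number": "0005", "sex": "M", "year_of_birth": 1975, "postal_prefix": "K1P", "diagnosis": "asthma"},
]


def drop_direct_identifiers(rows):
    """Remove every field listed as a direct identifier."""
    return [{k: v for k, v in row.items() if k not in DIRECT_IDENTIFIERS} for row in rows]


def equivalence_class_sizes(rows, quasi=QUASI_IDENTIFIERS):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(row[q] for q in quasi) for row in rows)


def max_reidentification_risk(rows, quasi=QUASI_IDENTIFIERS):
    """Assumed metric: 1 / size of the smallest equivalence class."""
    return 1 / min(equivalence_class_sizes(rows, quasi).values())


released = drop_direct_identifiers(records)
sizes = equivalence_class_sizes(released)
risk = max_reidentification_risk(released)
```

Removing direct identifiers alone leaves the quasi-identifiers intact, so the group sizes are what indicate how distinguishable a record remains.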


Anonymization Landscape


Spectrum of Identifiability

[Figure: spectrum running from little de-identification to significant de-identification]

There is a range of operational precedents, based on situational context and mitigating controls.


Managing Re-identification Risk


Automation


Certification


Enabling Post-marketing and Public Health Surveillance

Large EMR Vendor

Customer Profile

EMR vendor with more than 2664 clinics and 5850 physicians using the system in family clinics and walk-in clinics. The data set spans more than five years of all clinical, prescription, laboratory, scheduling and billing data.

Challenge

• Wants to anonymize data on 535,595 patients from general practices

• Longitudinal data needs to be used for on-going and on-demand analytics

Solution

• PARAT CORE

• PARAT integrated in ETL pipeline

Why Privacy Analytics

De-identified data would allow:

1. Post-marketing surveillance of adverse events

2. Public health surveillance

3. Prescription pattern analysis

4. Health services analysis


GI Protocol

• Two-arm protocol: GI events after taking NSAIDs with and without a PPI
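A two-arm comparison of this kind might be run on de-identified records roughly as sketched below. The record layout and all values are hypothetical. The sketch also simplifies the protocol by ignoring timing, so it does not check that a GI event occurred after the drug was taken.

```python
# Hypothetical de-identified records; fields and values are illustrative assumptions.
patients = [
    {"pid": "a1", "drugs": ["NSAID"], "gi_event": True},
    {"pid": "a2", "drugs": ["NSAID"], "gi_event": False},
    {"pid": "a3", "drugs": ["NSAID", "PPI"], "gi_event": False},
    {"pid": "a4", "drugs": ["NSAID", "PPI"], "gi_event": False},
    {"pid": "a5", "drugs": ["PPI"], "gi_event": True},  # no NSAID, so excluded
]


def assign_arm(patient):
    """Return the study arm for a patient, or None if they took no NSAID."""
    drugs = set(patient["drugs"])
    if "NSAID" not in drugs:
        return None
    return "NSAID+PPI" if "PPI" in drugs else "NSAID only"


def gi_events_by_arm(rows):
    """Return {arm: (patients_with_gi_event, patients_in_arm)}."""
    counts = {"NSAID only": [0, 0], "NSAID+PPI": [0, 0]}
    for row in rows:
        arm = assign_arm(row)
        if arm is None:
            continue
        counts[arm][1] += 1
        if row["gi_event"]:
            counts[arm][0] += 1
    return {arm: tuple(pair) for arm, pair in counts.items()}


summary = gi_events_by_arm(patients)
```

The counts come from invented data and say nothing about real outcomes; the point is only that the analysis needs drug exposure and event fields, not direct identifiers.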


Chlamydia Protocol

• Females 14-24 years old (inclusive) who were tested, and who tested positive, for Chlamydia in the previous 12 months


[email protected]

@kelemam

www.privacyanalytics.ca

Contact


QUESTIONS?