accelerate responsible clinical trials data sharing while safeguarding participant privacy

www.privacyanalytics.ca | 855.686.4781 [email protected] 251 Laurier Avenue, Suite 200 Ottawa, Ontario, Canada K1P 5J6 WEBINAR: Accelerate Responsible Clinical Trials Data Sharing While Safeguarding Participant Privacy

Upload: privacy-analytics

Post on 24-Jan-2015


Page 1: Accelerate responsible clinical trials data sharing while safeguarding participant privacy

www.privacyanalytics.ca | [email protected]

251 Laurier Avenue, Suite 200, Ottawa, Ontario, Canada K1P 5J6

WEBINAR: Accelerate Responsible Clinical Trials Data Sharing While Safeguarding Participant Privacy

Page 2

© 2014 Privacy Analytics, Inc. 2

Presenters

Chris Wright, Vice President, Marketing and Today’s Moderator, Privacy Analytics, Inc.

Dr. Khaled El Emam, CEO and founder of Privacy Analytics, Inc.

Page 3

Presenter

Chris Wright, Vice President, Marketing and Today’s Moderator, Privacy Analytics, Inc.

Page 4

Some Housekeeping

1. Please be sure to mute your phones.

2. We’ll have a Q&A after the webinar. Please type your questions in the dialog box to your right.

3. We’re giving away copies of our book, Anonymizing Health Data. Please click the link below to fill out the form. We’ll send the presentation to everyone after the webinar.

http://info.privacyanalytics.ca/anonymizinghealthcaredata.html

Page 5

Agenda

1. Overview of Privacy Analytics

2. Background on clinical trials transparency

3. Special considerations when anonymizing clinical trials data

4. A risk-based methodology for data anonymization

Page 6

About Privacy Analytics

For organizations that want to safeguard and enable their data for secondary use …

• Software that automates the de-identification and masking of data using a risk-based approach to anonymize personal information

• Integrated capabilities to anonymize structured and unstructured data from multiple sources

• Peer-reviewed methodologies and value-added services that certify data as de-identified using the expert statistical method under HIPAA

Page 7

Presenter

Dr. Khaled El Emam, CEO and founder of Privacy Analytics, Inc.

Page 8

Agenda

1. Overview of Privacy Analytics

2. Background on clinical trials transparency

3. Special considerations when anonymizing clinical trials data

4. A risk-based methodology for data anonymization

Page 9

Industry Principles

Page 10

• 30 April 2013: Final advice to the European Medicines Agency from the clinical trial advisory group on protecting patient confidentiality

• 24 June 2013: Publication and access to clinical trials data (draft policy)

• 14 May 2014: Finalisation of EMA policy on publication of and access to clinical trial data

• 12 June 2014: European Medicines Agency agrees policy on publication of clinical trial data with more user-friendly amendments

Page 11

“Adequately de-identified data should be made available for wide access”

Page 12

Page 13

What About the FDA?

Page 14

Direct & Quasi-identifiers

Examples of direct identifiers: name, address, telephone number, fax number, MRN, health card number, health plan beneficiary number, VID, license plate number, email address, photograph, biometrics, SSN, SIN, device number, clinical trial record number

Examples of quasi-identifiers: sex, date of birth or age, geographic locations (such as postal codes, census geography, information about proximity to known or unique landmarks), language spoken at home, ethnic origin, total years of schooling, marital status, criminal history, total income, visible minority status, profession, event dates, number of children, high-level diagnoses and procedures

Page 15

Anonymization Landscape

Page 16

De-identification Standards

Page 17

HIPAA Safe Harbor Method

Safe Harbor Direct Identifiers and Quasi-identifiers

1. Names

2. ZIP Codes (except first three)

3. All elements of dates (except year)

4. Telephone numbers

5. Fax numbers

6. Electronic mail addresses

7. Social security numbers

8. Medical record numbers

9. Health plan beneficiary numbers

10. Account numbers

11. Certificate/license numbers

12. Vehicle identifiers and serial numbers, including license plate numbers

13. Device identifiers and serial numbers

14. Web Universal Resource Locators (URLs)

15. Internet Protocol (IP) address numbers

16. Biometric identifiers, including finger and voice prints

17. Full face photographic images and any comparable images

18. Any other unique identifying number, characteristic, or code
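As a rough illustration of how the list above translates into record-level rules, the following Python sketch drops a few direct identifiers, truncates the ZIP code to its first three digits, and reduces dates to the year. The field names and the sample record are hypothetical, and the sketch covers only the abbreviated rules shown on the slide; the full regulation attaches further conditions (for example, on sparsely populated ZIP areas) that are not modeled here.

```python
from datetime import date

# Hypothetical field names, chosen for illustration only.
DIRECT_IDENTIFIER_FIELDS = {"name", "telephone", "email", "ssn", "mrn"}


def safe_harbor_sketch(record: dict) -> dict:
    """Return a copy of record with the slide-level Safe Harbor rules applied."""
    out = {}
    for field, value in record.items():
        if field in DIRECT_IDENTIFIER_FIELDS:
            continue  # drop listed identifiers entirely
        if field == "zip":
            out[field] = str(value)[:3]  # keep only the first three digits
        elif isinstance(value, date):
            out[field] = value.year  # keep only the year of any date
        else:
            out[field] = value  # pass other fields through unchanged
    return out


# Made-up record used only to exercise the function.
example = {
    "name": "John Smith",
    "telephone": "(412) 668-5468",
    "zip": "15213",
    "visit_date": date(2013, 6, 24),
    "lab_test": "Albumin, Serum",
}
cleaned = safe_harbor_sketch(example)
```

In this sketch `cleaned` keeps only the truncated ZIP, the visit year, and the lab test name.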

Page 18

Safe Harbor Implementations - I

Page 19

Safe Harbor Implementations - II

Page 20

Expert Determination (Statistical) Method

• A person with appropriate knowledge of and experience with generally accepted statistical and scientific principles and methods for rendering information not individually identifiable:

I. Applying such principles and methods, determines that the risk is “very small” that the information could be used, alone or in combination with other reasonably available information, by an anticipated recipient to identify an individual who is a subject of the information; and

II. Documents the methods and results of the analysis that justify such determination

Page 21

Section Takeaways

Current Status

• European regulators are moving in the direction of requiring clinical trials data release

• In two stages: redacted CSRs and then data

• Industry is already taking the initiative to develop mechanisms for data sharing

• There is a dearth of good standards to address privacy concerns

Page 22

Agenda

1. Overview of Privacy Analytics

2. Background on clinical trials transparency

3. Special considerations when anonymizing clinical trials data

4. A risk-based methodology for data anonymization

Page 23

Anonymization Approaches

• Microdata release: individual-level participant data (IPD) is provided to data recipients as flat files (CSV or SAS) or database files

– Microdata can be public or available through controlled access

• Online portal: data recipients can access IPD through a portal and perform their analysis through the portal only

– No raw data download allowed (different control mechanisms used)

– Online portal registration can be public or through a qualification process

Page 24

No Zero Risk

Page 25

Anonymizing Portal Access

• Is it necessary to anonymize data if it is on a portal?

– There are three types of attack:

• Deliberate attack by recipient – manage that risk through contracts and audit trails

• Data breach – managed by manufacturer through portal controls

• Inadvertent re-identification – could happen if the data recipient lives in the same geography as some of the participants

– It is the risk of inadvertent re-identification that needs to be managed in a portal, so anonymization is still needed

Page 26

Rare Diseases

• Clinical trials on participants with rare diseases have very small cohorts – can that data be anonymized?

• This depends on a number of factors:

– Whether the trial participants represent a fraction of all patients in the relevant geographies with the disease

– Whether the rare disease is visible or not

– Whether an adversary would know if someone has a rare disease

– Whether a portal is used or not

• It should not be taken for granted that it is not possible to anonymize rare disease trials

Page 27

Data Quality Balance

Page 28

Replicating Results

• Disclosed data should replicate the results of any published studies from the clinical trial

• This imposes a stringent standard on any anonymization techniques that are used

• It would be challenging for a manufacturer if it were not possible to replicate the results from published studies

Page 29

What to Expect When Anonymizing

• With sophisticated anonymization techniques, the anonymized data analysis will replicate the conclusions but not necessarily the exact values

• With basic anonymization techniques, the conclusions may not be replicated

Page 30

Anonymizing Dates

• Can convert all dates to intervals from enrollment

• However, if the enrollment period was short, then reverse-engineering a range of possible enrollment dates may be feasible

– That risk should be measured rather than assumed

– Will depend on whether geography is also known

• Date shifting is another scheme that allows the disclosure of precise dates and can still provide assurances about re-identification risk
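The two schemes named on this slide can be sketched in a few lines of Python. This is only an illustration: the function names, the sample dates, and the 30-day shift window are assumptions, and applying one random offset per participant (so that the gaps between that participant's events are preserved) is one common way to do date shifting rather than a method the slide specifies.

```python
import random
from datetime import date, timedelta


def to_intervals(enrollment: date, events: list[date]) -> list[int]:
    """Replace each event date with the number of days since enrollment."""
    return [(d - enrollment).days for d in events]


def shift_dates(events: list[date], max_shift_days: int,
                rng: random.Random) -> list[date]:
    """Shift all of one participant's dates by the same random offset."""
    offset = timedelta(days=rng.randint(-max_shift_days, max_shift_days))
    return [d + offset for d in events]


# Made-up dates for a single participant.
enrollment = date(2013, 4, 30)
visits = [date(2013, 5, 14), date(2013, 6, 24)]

intervals = to_intervals(enrollment, visits)
shifted = shift_dates(visits, max_shift_days=30, rng=random.Random(0))
```

With intervals, the calendar dates disappear and only elapsed days remain; with shifting, full dates are released but moved by an offset unknown to the recipient, while the spacing between visits is unchanged.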

Page 31

Anonymizing Patient Locations

• Most clinical trials do not collect patient location information for analysis purposes

• However, if that information is needed, then geo-clustering of ZIP/postal codes is a good technique for protecting location information

• It maintains geospatial specificity

Page 32

Poor Selection of Pseudonyms

Page 33

Releasing Site Details

• Replacing the site name with an ID may not always be effective

• The highest recruiting sites are likely knowable from clinicaltrials.gov or equivalent registries

• A frequency analysis on the data would reveal which site was the highest recruiting (especially if country information is provided)

• The risk is from geoproxy attacks – many participants will seek care in facilities close to where they live

• For a nontrivial percentage of participants, it may be possible to predict their residence location with some accuracy
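The frequency analysis mentioned above takes very little effort, which is why replacing site names with IDs may not be enough. The Python sketch below uses made-up records and hypothetical field names (`site_id`, `country`); it simply counts participants per pseudonymous site within each country and reports the largest one, which an adversary could then try to match against the highest-recruiting site listed in a public registry.

```python
from collections import Counter

# Hypothetical released records: site names already replaced by opaque IDs.
released = [
    {"site_id": "S01", "country": "CA"},
    {"site_id": "S01", "country": "CA"},
    {"site_id": "S01", "country": "CA"},
    {"site_id": "S02", "country": "CA"},
    {"site_id": "S03", "country": "US"},
    {"site_id": "S03", "country": "US"},
]


def top_site_per_country(records):
    """Return {country: (site_id, participant_count)} for the largest site."""
    counts = Counter((r["country"], r["site_id"]) for r in records)
    best = {}
    for (country, site_id), n in counts.items():
        if country not in best or n > best[country][1]:
            best[country] = (site_id, n)
    return best


top = top_site_per_country(released)
```

Here `top` singles out S01 as the largest Canadian site even though no site name was released.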

Page 34

Public IPD?

• Public IPD will be challenging to anonymize adequately and ensure exact replication of published results

• Public IPD is still useful with that caveat – may be good for summary statistics and the investigation of basic relationships

• Therefore this should not be discounted

• Needs to be augmented with other data release methods that would allow the disclosure of more detailed data

Page 35

Data Release Strategy

• Strategy 1:

– When a data request is received, the data set is anonymized to specifically meet the data request

– Must be repeated for all data requests

• Strategy 2:

– Create one anonymized data set for each trial and, irrespective of the data request, release the same complete anonymized data set

– Much more cost effective, but probably provides more data than is needed

Page 36

The Importance of Governance

• More than just technical approaches are needed

• Governance necessary for:

– Tracking data users

– Stigmatizing analytics reviews

– Audits where necessary

– Review of anonymization practices

– Monitoring legislative and regulatory environment

Page 37

Section Takeaways

Special Considerations

• Multiple approaches to releasing IPD

• Challenges releasing high quality public IPD

• Sophisticated anonymization techniques are needed to ensure data quality

• Governance also needed (as well as technical approaches)

Current Status

• European regulators are moving in the direction of requiring clinical trials data release

• In two stages: redacted CSRs and then data

• Industry is already taking the initiative to develop mechanisms for data sharing

• There is a dearth of good standards to address privacy concerns

Page 38

Agenda

1. Overview of Privacy Analytics

2. Background on clinical trials transparency

3. Special considerations when anonymizing clinical trials data

4. A risk-based methodology for data anonymization

Page 39

Identifiability Spectrum

[Figure: spectrum running from little de-identification to significant de-identification]

A range of operational precedents exists based on the situational context of the data’s use and available mitigating controls that protect it.

Page 40

Re-identification Risk: Example

Direct identifiers: ID, Name, Telephone No. Indirect identifiers: Sex, Year of Birth. Sensitive variables: Lab Test, Lab Result. Other: Pay Delay.

ID  Name               Telephone No.   Sex  Year of Birth  Lab Test                    Lab Result  Pay Delay
1   John Smith         (412) 668-5468  M    1959           Albumin, Serum              4.8         37
2   Alan Smith         (413) 822-5074  M    1969           Creatine Kinase             86          36
3   Alice Brown        (416) 886-5314  F    1955           Alkaline Phosphatase        66          52
4   Hercules Green     (613) 763-5254  M    1959           Bilirubin                   <0          36
5   Alicia Freds       (613) 586-6222  F    1942           BUN/Creatinine Ratio        17          82
6   Gill Stringer      (954) 699-5423  F    1975           Calcium, Serum              9.2         34
7   Marie Kirkpatrick  (416) 786-6212  F    1966           Free Thyroxine Index        2.7         23
8   Leslie Hall        (905) 668-6581  F    1987           Globulin, Total             3.5         9
9   Douglas Henry      (416) 423-5965  M    1959           B-type Natriuretic peptide  134         38
10  Fred Thompson      (416) 421-7719  M    1967           Creatine Kinase             80          21

Annotation: two quasi-identifiers (sex and year of birth) match in three records within the dataset (IDs 1, 4 and 9)
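The annotation on this slide can be reproduced with a short Python sketch that groups the ten example records by their quasi-identifiers (sex, year of birth) and counts how many records share each combination. Scoring each record as 1 divided by the size of its group is an assumption made here for illustration; it is a commonly used measure of re-identification risk, but the slide itself does not state a formula.

```python
from collections import Counter

# (sex, year of birth) for IDs 1-10 in the example table.
quasi_identifiers = [
    ("M", 1959), ("M", 1969), ("F", 1955), ("M", 1959), ("F", 1942),
    ("F", 1975), ("F", 1966), ("F", 1987), ("M", 1959), ("M", 1967),
]

# Number of records sharing each quasi-identifier combination.
class_sizes = Counter(quasi_identifiers)

# Assumed risk measure: 1 / (number of records sharing the combination).
record_risk = [1 / class_sizes[q] for q in quasi_identifiers]

largest_class = class_sizes.most_common(1)[0]
max_risk = max(record_risk)
```

The largest group is (M, 1959) with three records, matching the slide's callout; every other record is unique on these two quasi-identifiers, so the maximum risk under this measure is 1.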

Page 41

Spectrum of Identifiability

[Figure: spectrum running from little de-identification to significant de-identification]

Leading research organizations apply these precedents to data release for secondary purposes. We’ve embedded these precedents into our software, PARAT CORE.

Page 42

Managing Re-identification Risk

Page 43

Complexity Stifles Time to Insight

“… removing patient identifiers and formatting all data sets [ ..] can take up to six months.”

Roche Description of Their Clinical Trials Data Sharing Process for Research Requests

… and the volume of clinical trials data releases will continue to grow rapidly

Page 44

Automating Anonymization

Page 45

Reduce Complexity: Accelerate Data Releases

A scalable set of packaged capabilities that enables the release of anonymized data for analysis quickly, securely and cost-effectively:

Automate

Audit

Analyze

Page 46

Creating Expertise to Govern Data Releases

• Course on risk-based anonymization (2-day): on-site or remote

• Exam on body of knowledge and work through case studies

• Maintaining knowledge over time through continuous education

• Coaching on two data sets

• Requires automated support to operationalize

Page 47

Post-marketing Surveillance

• Wanted to anonymize data on 535,595 patients from general practices

• Longitudinal data needed to be used for ongoing and on-demand analytics

Challenges:

• Significant size of the data set, which held more than five years of clinical, prescription, laboratory, scheduling and billing data on patients

• Numerous release requests from more than 2664 clinics and 5850 physicians

Analytic Outcomes:

De-identified data to analyze:

• Post-marketing surveillance of adverse events

• Public health surveillance

• Prescription pattern analysis

• Health services analysis

Page 48

GI Protocol

• Two-arm protocol: GI events after taking NSAIDs with and without a PPI

Page 49

Chlamydia Protocol

• Females 14-24 years old (inclusive) who were tested for, and tested positive for, Chlamydia in the previous 12 months

Page 50

Section Takeaways

Methodology & Software

• A risk-based methodology can be used to release high quality IPD

• The process can be automated to accelerate data release, reduce costs, ensure consistency, and provide a defensible result

• Can develop internal expertise or outsource the whole data release process

Special Considerations

• Multiple approaches to releasing IPD

• Challenges releasing high quality public IPD

• Sophisticated anonymization techniques are needed to ensure data quality

• Governance also needed (as well as technical approaches)

Current Status

• European regulators are moving in the direction of requiring clinical trials data release

• In two stages: redacted CSRs and then data

• Industry is already taking the initiative to develop mechanisms for data sharing

• There is a dearth of good standards to address privacy concerns

Page 51

Balancing Privacy with Data Utility

The Analytic Benefits of our Approach

1. Data Quality: Ensuring de-identified data has analytic usefulness by minimizing the amount of distortion while still ensuring that re-identification risk is very small

2. Analytic Granularity: Allowing users to configure the extent of de-identification to match the characteristics of the analysis that is anticipated

3. Depth of Insight: Enabling analysis of the total patient health experience, to compile a complete picture of this experience from multiple data sources and types

Page 52

Final Thoughts

Also, contact me to learn more at [email protected]. We can set up a personalized demo or have a discussion on your current anonymization needs. Just drop me a line.

We’re giving away copies of our book, Anonymizing Health Data: http://info.privacyanalytics.ca/anonymizinghealthcaredata.html

Anonymization Survey:

• http://surveys.ronin.com/wix/p1834200753.aspx?src=1

July 14-16, Health Analytics Expo and Symposium, Chicago, IL.

Page 53

Question and Answer