Tragedy of the Deidentified Data Commons
An Appeal for Transparency and Access
Jane Bambauer
James E. Rogers College of Law
University of Arizona
The Data Commons
Information collected by the government:
tax information, epidemiological data, census surveys, educational records, home mortgage data
Information collected by private companies
Anonymized and released*
The Anonymization Problem
• Research subjects can be reidentified in anonymized databases “with astonishing ease.”
AOL
Re-identification of Gov. Weld
Netflix re-identification
• Every privacy law must be rewritten to eliminate dependence on anonymization and to restrict access to all data (even deidentified data) without consent
Paul Ohm, Broken Promises of Privacy
57 UCLA L. REV. 1701
Save the Data Commons
The Data Commons has been used to:
• Detect housing and employment discrimination
• Debunk the myth of the “welfare queen”
• Inform the healthcare and mortgage lending policy debates
• Correct longstanding misconceptions about crime and law enforcement
• Lots more…
Jane Yakowitz, Tragedy of the Data Commons
Hazards of Covert Noise-Adding
Exaggerated Risks of Reidentification: The Gov. Weld Example
Gov. Weld Reidentification
Latanya Sweeney collected Gov. Weld’s voter registration information and publicly available hospital data
Only one hospital patient matched Gov. Weld’s DOB, zip, and gender
Conclusion from analysis of US Census data: 87% can be uniquely identified from DOB, zip, and gender
Golle recalculations: 63% are unique using DOB, zip, and gender
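The uniqueness figures above are the share of people whose combination of DOB, zip, and gender is shared with nobody else. A minimal sketch of that calculation, assuming records are plain dictionaries; the function name `unique_fraction`, the field names, and the toy population are hypothetical and are not drawn from Sweeney's or Golle's data:

```python
from collections import Counter

def unique_fraction(records, keys=("dob", "zip", "gender")):
    """Share of records whose quasi-identifier combination appears exactly once."""
    combos = [tuple(r[k] for k in keys) for r in records]
    counts = Counter(combos)
    unique = sum(1 for c in combos if counts[c] == 1)
    return unique / len(combos) if combos else 0.0

# Hypothetical toy population (made-up values for illustration only).
population = [
    {"dob": "1950-01-01", "zip": "00001", "gender": "M"},
    {"dob": "1950-01-01", "zip": "00001", "gender": "M"},
    {"dob": "1962-03-04", "zip": "00001", "gender": "F"},
    {"dob": "1971-05-06", "zip": "00002", "gender": "F"},
]
print(unique_fraction(population))  # 0.5
```

The result depends heavily on which population is counted, which is one reason two analyses of the same three fields can report different percentages.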
Daniel Barth-Jones, “Reidentification” of Governor William Weld
Sweeney et al. 2013 PGP Study
579 Personal Genome Project participants provided their DOB, zip code, and gender
Using voter registration records and other commercial data sources, Sweeney et al. were able to reidentify 28% (accuracy unclear)
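A linkage attempt of this kind can be sketched as a join between a deidentified file and a named file on the shared fields. This is an illustrative sketch only, not the procedure used in the PGP study; the function `link`, the field names, and the sample records are hypothetical:

```python
def link(deidentified, identified, keys=("dob", "zip", "gender")):
    """Attach a name to a deidentified record only when exactly one named record shares its key."""
    index = {}
    for person in identified:
        index.setdefault(tuple(person[k] for k in keys), []).append(person["name"])
    matches = {}
    for rec in deidentified:
        candidates = index.get(tuple(rec[k] for k in keys), [])
        if len(candidates) == 1:  # ambiguous or missing keys produce no match
            matches[rec["id"]] = candidates[0]
    return matches

# Hypothetical records (made-up values for illustration only).
study = [
    {"id": "p1", "dob": "1962-03-04", "zip": "00001", "gender": "F"},
    {"id": "p2", "dob": "1950-01-01", "zip": "00001", "gender": "M"},
]
voter_roll = [
    {"name": "Person A", "dob": "1962-03-04", "zip": "00001", "gender": "F"},
    {"name": "Person B", "dob": "1950-01-01", "zip": "00001", "gender": "M"},
    {"name": "Person C", "dob": "1950-01-01", "zip": "00001", "gender": "M"},
]
print(link(study, voter_roll))  # {'p1': 'Person A'}
```

A single candidate in the named file is not proof of a correct match: if the named file omits part of the population, the true person may simply be absent from it, so matches still need verification.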
2009 ONC Study
Out of 15,000 HIPAA-compliant records, 2 could be reidentified
0.013% Chance of Reidentification
For comparison’s sake, chance of dying from an auto accident this year: 0.017%
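The percentage follows directly from the two counts on the slide. A short check of the arithmetic, with variable names chosen here for illustration; the auto accident figure is taken from the slide as given and is not independently verified:

```python
reidentified = 2        # records reidentified in the ONC study, per the slide
records = 15_000        # HIPAA-compliant records examined, per the slide

rate_percent = reidentified / records * 100
print(f"{rate_percent:.3f}%")  # 0.013%

auto_fatality_percent = 0.017  # comparison figure as stated on the slide
print(rate_percent < auto_fatality_percent)  # True
```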
Total Number of Known Malicious Reidentifications
0 or 1*
If I Were a Malicious Intruder…
3,101 reported data breaches in the U.S.
(about half a billion records)
700 reported breaches of health records
If I Were a Malicious Intruder…
Sift through Garbage
Make Inferences from Facebook Profiles
Swab a Coffee Cup
What We Have to Lose
• Fewer Opportunities for Replication
• Fewer Voluntary Research Databases
• Fewer Involuntary Public Databases
• Increased Regulatory Precautions
More Status Quo Bias
Vioxx “What If” Study
From Richard Platt’s FDA testimony in 2007
Vioxx approved May 1999
Removed from market September 2004 (64 months)
Data on 7 million patients: 34 months
Data on 100 million: 3 months
88,000-139,000 avoidable heart attacks
27,000-55,000 avoidable deaths