zimmer sachrp slides v2

Research Ethics in the 2.0 Era: Conceptual Gaps for Ethicists, Researchers, IRBs. Michael Zimmer, PhD, School of Information Studies, University of Wisconsin-Milwaukee. [email protected] http://michaelzimmer.org Secretary’s Advisory Committee on Human Research Protections


DESCRIPTION

On Wednesday, July 21, 2010, I will be presenting in front of the Secretary’s Advisory Committee on Human Research Protections (SACHRP), part of the Office for Human Research Protections in the United States Department of Health and Human Services (HHS). My presentation will focus on how Web 2.0 tools, environments, and experiences are creating new conceptual gaps in our understanding of privacy, anonymity/identifiability, consent, and harm.

TRANSCRIPT

Page 1: Zimmer sachrp slides v2


Research Ethics in the 2.0 Era: Conceptual Gaps for Ethicists, Researchers, IRBs

Michael Zimmer, PhD

School of Information Studies

University of Wisconsin-Milwaukee

[email protected]

http://michaelzimmer.org

Secretary’s Advisory Committee on Human Research Protections

July 21, 2010

Page 2: Zimmer sachrp slides v2

My Perspective

• Approaching the problem of “The Internet in Human Subjects Research” from the field of information ethics

• Focus on how 2.0 tools, environments, and experiences are creating new conceptual gaps in our understanding of:
  – Privacy
  – Anonymity vs. Identifiability
  – Consent
  – Harm

Page 3: Zimmer sachrp slides v2

Illuminating Cases

1. Tastes, Ties, and Time (T3) Facebook data release

2. Pete Warden’s harvesting (and proposed release) of public Facebook profiles

3. Question of consent for using “public” Twitter streams

4. Library of Congress archiving “public” Twitter streams

Page 4: Zimmer sachrp slides v2

T3 Facebook Project

• Harvard-based Tastes, Ties, and Time (T3) research project sought to understand social network dynamics of large groups of students

• Solution: Work with Facebook & an “anonymous” university to harvest the Facebook profiles of an entire cohort of college freshmen
  – Repeat each year for their 4-year tenure
  – Co-mingle with other University data (housing, major, etc.)
  – Coded for race, gender, political views, cultural tastes, etc.

Page 5: Zimmer sachrp slides v2

T3 Data Release

• As an NSF-funded project, the dataset was made publicly available
  – First phase released September 25, 2008
  – One year of data (n=1,640)
  – Prospective users must submit application to gain access to dataset
  – Detailed codebook available for anyone to access

Page 6: Zimmer sachrp slides v2

“Anonymity” of the T3 Dataset

• But dataset had unique cases (based on codebook)

• If we could identify the source university, individuals could potentially be identified
  – Took me minimal effort to discern the source was Harvard

• The anonymity (and privacy) of subjects in the study might be in jeopardy…

“All the data is cleaned so you can’t connect anyone to an identity”

Page 7: Zimmer sachrp slides v2

T3 Good-Faith Efforts to Protect Subject Privacy

1. Only those data that were accessible by default by each RA were collected

2. Removing/encoding of “identifying” information

3. Tastes & interests (“cultural fingerprints”) will only be released after “substantial delay”

4. To download, must agree to “Terms and Conditions of Use” statement

5. Reviewed & approved by Harvard’s IRB

Zimmer, M. (2010). “But the data is already public”: on the ethics of research in Facebook. Ethics & Information Technology

Page 8: Zimmer sachrp slides v2

1. Only those data that were accessible by default by each RA were collected

• False assumption that because the RA could access the profile, it was “publicly available”

• RAs were Harvard graduate students, and thus part of the “Harvard network” on Facebook

“We have not accessed any information not otherwise available on Facebook”

Page 9: Zimmer sachrp slides v2

2. Removing/encoding of “identifying” information

• While names, birthdates, and e-mails were removed…

• Various other potentially “identifying” details remained
  – Ethnicity, home country/state, major, etc.

• AOL/Netflix cases taught us how nearly any data could be potentially “identifying”

“All identifying information was deleted or encoded immediately after the data were downloaded”
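One way to make the concern on this slide concrete is to count how many records are unique on the attributes that remain after names, birthdates, and e-mails are stripped. The sketch below is illustrative only: the records, the field names, and the `unique_cases` helper are hypothetical and are not drawn from the T3 dataset or its codebook.

```python
from collections import Counter

# Hypothetical records: direct identifiers removed, other attributes kept.
records = [
    {"gender": "F", "state": "MA", "major": "Economics"},
    {"gender": "F", "state": "MA", "major": "Economics"},
    {"gender": "M", "state": "NY", "major": "History"},
    {"gender": "M", "state": "MT", "major": "Classics"},
    {"gender": "F", "state": "HI", "major": "Physics"},
]


def unique_cases(rows, keys):
    """Return the rows whose combination of `keys` occurs exactly once."""
    def combo(row):
        return tuple(row[k] for k in keys)

    counts = Counter(combo(row) for row in rows)
    return [row for row in rows if counts[combo(row)] == 1]


singletons = unique_cases(records, ["gender", "state", "major"])
print(len(singletons), "of", len(records), "records are unique on these fields")
```

A record that is unique on a handful of ordinary attributes can be singled out by anyone who already knows those attributes about a person, even though no name appears in the data.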

Page 10: Zimmer sachrp slides v2

3. Tastes & interests will only be released after “substantial delay”

• Individuals might be uniquely identified by what they list as a favorite book, movie, restaurant, etc.

• Steps taken to mitigate this privacy risk:
  – In initial release, cultural taste labels assigned random numbers
  – Actual labels to be released after a “substantial delay” – 3 years later

T3 researchers recognize the unique nature of the cultural taste labels: “cultural fingerprints”
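A small sketch can show what replacing taste labels with random numbers does and does not hide. This is not the T3 team's actual procedure, which the slides do not describe; the subjects, the titles, and the `encode_labels` helper are all hypothetical. The only assumption is that each distinct label receives its own code.

```python
import random
from collections import Counter

# Hypothetical favorite-movie lists for four subjects.
tastes = {
    "s1": ["Fight Club", "Amelie"],
    "s2": ["Fight Club"],
    "s3": ["Fight Club", "Stalker"],
    "s4": ["Fight Club", "Amelie"],
}


def encode_labels(data, seed=0):
    """Replace each distinct label with a random, unique integer code."""
    labels = sorted({label for items in data.values() for label in items})
    rng = random.Random(seed)
    codes = rng.sample(range(10_000), len(labels))  # distinct codes
    mapping = dict(zip(labels, codes))
    coded = {sid: [mapping[label] for label in items] for sid, items in data.items()}
    return coded, mapping


coded, mapping = encode_labels(tastes)

# The label text is hidden, but how often each label occurs is unchanged.
plain_counts = sorted(Counter(l for v in tastes.values() for l in v).values())
coded_counts = sorted(Counter(c for v in coded.values() for c in v).values())
print(plain_counts, coded_counts)
```

Because the recoding is one-to-one, a taste listed by only one subject is still listed by only one subject after coding, so rarity and co-occurrence patterns remain visible in the released data.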

Page 11: Zimmer sachrp slides v2

3. Tastes & interests will only be released after “substantial delay”

• But, is 3 years really a “substantial delay”?
  – Subjects’ privacy expectations don’t expire after artificially imposed timeframe
  – Datasets like these are often used years after their initial release, so the delay is largely irrelevant

• T3 researchers also will provide immediate access on a “case-by-case” basis
  – No details given, but seemingly contradicts any stated concern over protecting subject privacy

Page 12: Zimmer sachrp slides v2

4. “Terms and Conditions of Use” statement

3. I will use the dataset solely for statistical analysis and reporting of aggregated information, and not for investigation of specific individuals….

4. I will produce no links…among the data and other datasets that could identify individuals…

6. I will not knowingly divulge any information that could be used to identify individual participants

7. I will make no use of the identity of any person or establishment discovered inadvertently.

Page 13: Zimmer sachrp slides v2

4. “Terms and Conditions of Use” statement

• The language within the TOS clearly acknowledges the privacy implications of the T3 dataset
  – Might help raise awareness among potential researchers; appease IRB

• But “click-wrap” agreements are notoriously ineffective at changing behavior

• Unclear how the T3 researchers specifically intend to monitor or enforce compliance
  – There has already been one research paper that might violate the TOS

Page 14: Zimmer sachrp slides v2

5. Reviewed & Approved by IRB

“Our IRB helped quite a bit as well. It is their job to insure that subjects’ rights are respected, and we think we have accomplished this”

“The university in question allowed us to do this and Harvard was on board because we don’t actually talk to students, we just accessed their Facebook information”

Page 15: Zimmer sachrp slides v2

5. Reviewed & Approved by IRB

• For the IRB, downloading Facebook profile information seemed less invasive than actually talking with subjects…
  – Did the IRB know unique, personal, and potentially identifiable information was present in the dataset?

• …and consent was not needed since the profiles were “freely available”
  – But RA access to restricted profiles complicates this; did the IRB contemplate this?
  – Is putting information on a social network “consenting” to its use by researchers?

Page 16: Zimmer sachrp slides v2

T3 Good-Faith Efforts to Protect Subject Privacy

1. Only those data that were accessible by default by each RA were collected

2. Removing/encoding of “identifying” information

3. Tastes & interests (“cultural fingerprints”) will only be released after “substantial delay”

4. To download, must agree to “Terms and Conditions of Use” statement

5. Reviewed & approved by Harvard’s IRB

Zimmer, M. (2010). “But the data is already public”: on the ethics of research in Facebook. Ethics & Information Technology

Page 17: Zimmer sachrp slides v2

Illuminating Cases

1. Tastes, Ties, and Time (T3) Facebook data release

2. Pete Warden’s harvesting (and proposed release) of public Facebook profiles

3. Question of consent for using “public” Twitter streams

4. Library of Congress archiving “public” Twitter streams

Page 18: Zimmer sachrp slides v2

Pete Warden Facebook Dataset

• Exploited flaw in FB’s architecture to access and harvest public profiles of 215 million users (without needing to log in)

• Planned to release entire dataset – with all personal information intact – to academic community
  – Would it be acceptable to use this dataset?
  – Users made data public, but did they expect it to be harvested by bots, aggregated, and made available as raw data?

http://michaelzimmer.org/2010/02/12/why-pete-warden-should-not-release-profile-data-on-215-million-facebook-users/

Page 19: Zimmer sachrp slides v2

Harvesting Public Twitter Streams

• Is it ethical for researchers to follow and systematically capture public Twitter streams without first obtaining specific, informed consent from the subjects?
  – Are tweets publications, or utterances?
  – Are you reading a text, or recording a discussion?
  – What are users’ expectations about how their tweets are being found & used?
  – What if a user later changes their preferences, deletes tweets, etc.?

http://michaelzimmer.org/2010/02/12/is-it-ethical-to-harvest-public-twitter-accounts-without-consent/

Page 20: Zimmer sachrp slides v2

LOC Archive of Public Tweets

• Library of Congress will archive all public tweets
  – 6-month delay, restricted access to researchers

• Open questions:
  – Can users opt out of being in permanent archive?
  – Can users delete tweets from archive?
  – Will geolocational and other profile data be included?
  – What about a public tweet that is re-tweeting a private one?

Page 21: Zimmer sachrp slides v2

Illuminating Cases

1. Tastes, Ties, and Time (T3) Facebook data release

2. Pete Warden’s harvesting (and proposed release) of public Facebook profiles

3. Question of consent for using “public” Twitter streams

4. Library of Congress archiving “public” Twitter streams

Page 22: Zimmer sachrp slides v2

Conceptual Gaps

• Focus on how 2.0 tools, environments, and experiences are creating new conceptual gaps in our understanding of:
  – Privacy
  – Anonymity vs. Identifiability
  – Consent
  – Harm

• Conceptual gaps lead to policy vacuums that need to be addressed…

Page 23: Zimmer sachrp slides v2

Conceptual Gap: Privacy

• Presumption that because subjects make information available on Facebook/Twitter, they don’t have an expectation of privacy
  – Ignores contextual nature of sharing
  – Ignores whether users really understand their privacy settings

• Going forward…
  – Recognize the strict dichotomy of public/private doesn’t apply in the 2.0 world
  – Consider Helen Nissenbaum’s theory of “contextual integrity” as more fitting rubric
  – Strive to consult privacy scholars on projects & reviews

Page 24: Zimmer sachrp slides v2

Conceptual Gap: Anonymity vs. Identifiability

• Presumption that stripping names & other obvious identifiers provides anonymity
  – Ignores how anything can potentially be identifiable information and become the “missing link” to re-identify an entire dataset

• Going forward
  – Recognize “personally identifiable information” is an imperfect concept
    • Consider EU’s protection of any data “potentially linkable” to an identity
  – “Anonymous” datasets are not achievable and provide a false sense of protection
    • Paul Ohm, “Broken Promises of Privacy”
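The “missing link” idea can be sketched as a simple join between a release with names removed and a second source that still carries names. Everything below is hypothetical: the records, the field names, and the `link` helper are invented for illustration and do not describe any actual dataset or attack discussed in the talk.

```python
from collections import defaultdict

# Hypothetical "anonymous" release: no names, other attributes retained.
released = [
    {"id": 101, "state": "MT", "major": "Classics", "politics": "liberal"},
    {"id": 102, "state": "MA", "major": "Economics", "politics": "moderate"},
    {"id": 103, "state": "MA", "major": "Economics", "politics": "conservative"},
]

# Hypothetical auxiliary source (for example, a public directory) with names.
directory = [
    {"name": "Student A", "state": "MT", "major": "Classics"},
    {"name": "Student B", "state": "MA", "major": "Economics"},
    {"name": "Student C", "state": "MA", "major": "Economics"},
]


def link(released_rows, aux_rows, keys):
    """Pair a name with a released row when the shared keys match one-to-one."""
    rel_index, aux_index = defaultdict(list), defaultdict(list)
    for row in released_rows:
        rel_index[tuple(row[k] for k in keys)].append(row)
    for row in aux_rows:
        aux_index[tuple(row[k] for k in keys)].append(row)

    matches = []
    for combo, rel in rel_index.items():
        aux = aux_index.get(combo, [])
        if len(rel) == 1 and len(aux) == 1:  # unambiguous in both sources
            matches.append((aux[0]["name"], rel[0]))
    return matches


matches = link(released, directory, ["state", "major"])
for name, row in matches:
    print(name, "->", row["politics"])
```

In this toy data only the record that is unique on state and major is linked, but that single link attaches a name to an attribute (political views) the release was meant to keep unattributed.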

Page 25: Zimmer sachrp slides v2

Conceptual Gap: Consent

• Presumption that because something is shared, the subject is consenting to it being harvested for research
  – Undervalues subject’s intent
  – Ignores how research method might allow unanticipated access to “restricted” data

• Going forward
  – Must recognize that a user making something public online comes with a set of assumptions/expectations about who can access and how
  – Does anything outside this need specific consent?

Page 26: Zimmer sachrp slides v2

Conceptual Gap: Harm

• Researchers often imply “data is already public, so what harm could happen”
  – Ignores dignity & autonomy, let alone unanticipated consequences

• Going forward
  – Must move beyond the concept of harm as requiring a tangible consequence
    • Protecting from harm is more than protecting from hackers, spammers, identity thieves, etc.
  – Consider dignity/autonomy theories of harm

• Must a “wrong” occur for there to be damage to the subject?

• Do subjects deserve control over the use of their data streams?

Page 27: Zimmer sachrp slides v2

Conceptual Gaps ==> Policy Vacuums

• Researchers & IRBs are trying to do the right thing when faced with Internet research projects (and usually, they do)

• But the fluidity and complexity of Web 2.0 creates significant conceptual gaps

• Leaving IRBs with policy vacuums
  – Are IRBs able to sufficiently fill these gaps and scrutinize innovative Internet research with respect to these new complexities?

Page 28: Zimmer sachrp slides v2

How to address the policy vacuums?

• Specialized training
  – SACHRP should require IRBs (or designated members) to participate in certified workshops on Internet research ethics

• Best practices & guidelines
  – SACHRP should provide general best practices & guidelines (or link to existing ones) to help IRBs when confronted with Internet research

• Other collaborations
  – InternetResearchEthics.org
  – Digital Media & Learning collaboration

Page 29: Zimmer sachrp slides v2


Research Ethics in the 2.0 Era: Conceptual Gaps for Ethicists, Researchers, IRBs

Michael Zimmer, PhD

School of Information Studies

University of Wisconsin-Milwaukee

[email protected]

http://michaelzimmer.org

Secretary’s Advisory Committee on Human Research Protections

July 21, 2010