Privacy Implications of Online Data Collection

Lorrie Faith Cranor, AT&T Labs-Research, http://www.research.att.com/~lorrie/
DIMACS Workshop

Post on 25-Feb-2016

Page 1: Privacy Implications of Online Data Collection

Privacy Implications of Online Data Collection

Lorrie Faith Cranor

AT&T Labs-Research

http://www.research.att.com/~lorrie/

DIMACS Workshop

Page 2: Privacy Implications of Online Data Collection

Recent headlines

Doubleclick shelves plan to tag Web surfers

Clinton Issues Privacy Warning To Technology Leaders

Websites Pull Back From Doubleclick

Senators Raise Privacy Issue In AOL-Time Warner Hearing

Activists charge DoubleClick double cross

Page 3: Privacy Implications of Online Data Collection

Online profiling in the comics!

Cathy, March 1, 2000

Page 4: Privacy Implications of Online Data Collection

How do they get my data?

Browsers advertise
  IP address, domain name, organization, referring page
  platform: O/S, browser
  which information is requested

Information available to
  end servers
  local system administrators
  other third parties (e.g., doubleclick.com)

Cookies, Web bugs, advertising networks

Page 5: Privacy Implications of Online Data Collection

Browsers like to chatter

A typical HTTP request

  GET http://www.amazon.com/ HTTP/1.0
  User-Agent: Mozilla/3.01 (X11; I; SunOS 4.1.4 sun4m)
  Host: www.amazon.com
  Referer: http://www.alcoholics-anonymous.org/
  Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*
  Cookie: session-id-time=868867200; session-id=6828-2461327-649945; group_discount_cookie=F
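To make concrete what a server learns from this single request, here is a minimal Python sketch that splits the request text into its header fields. The function name `parse_request` and the labels in `revealed` are hypothetical, chosen only for illustration; the request text is the one from the slide.

```python
def parse_request(raw: str) -> dict:
    """Split a raw HTTP request into its request line and header fields."""
    lines = raw.strip().splitlines()
    method, target, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        if not line.strip():
            break  # a blank line ends the header section
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {"method": method, "target": target, "version": version, "headers": headers}


REQUEST = """\
GET http://www.amazon.com/ HTTP/1.0
User-Agent: Mozilla/3.01 (X11; I; SunOS 4.1.4 sun4m)
Host: www.amazon.com
Referer: http://www.alcoholics-anonymous.org/
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*
Cookie: session-id-time=868867200; session-id=6828-2461327-649945; group_discount_cookie=F
"""

parsed = parse_request(REQUEST)

# Each value below reaches the server without the user typing anything.
revealed = {
    "platform and browser": parsed["headers"]["user-agent"],
    "previous page": parsed["headers"]["referer"],
    "returning-visitor id": parsed["headers"]["cookie"],
}
for label, value in revealed.items():
    print(f"{label}: {value}")
```

The Referer field is the sensitive one here: it tells the bookseller which site the user was reading immediately before.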

Page 6: Privacy Implications of Online Data Collection

Servers record what they hear

Server logs store host, time, date, requested URL, referrer

  ppp.bu.edu - - [09/Dec/1996:20:33:22 -0500] "GET /cgi-bin/wwwais?hemoglobin+gene HTTP/1.0" 200 527

affiliation: Boston University, probably working from home, probably student or faculty in biology
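The inference above needs nothing more than a pattern match on the log entry. The sketch below parses the slide's log line, written here with straight quotes, an uppercase GET, and a four-digit time-zone offset so that it follows the Common Log Format. The names `LOG_PATTERN` and `parse_log_line` are hypothetical.

```python
import re
from datetime import datetime

# Common Log Format: host ident authuser [date] "request" status bytes
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_log_line(line: str) -> dict:
    """Turn one Common Log Format line into a dictionary of fields."""
    match = LOG_PATTERN.match(line)
    if match is None:
        raise ValueError(f"not a Common Log Format line: {line!r}")
    entry = match.groupdict()
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["method"], entry["path"], entry["protocol"] = entry["request"].split(" ", 2)
    return entry


LINE = 'ppp.bu.edu - - [09/Dec/1996:20:33:22 -0500] "GET /cgi-bin/wwwais?hemoglobin+gene HTTP/1.0" 200 527'
entry = parse_log_line(LINE)

# The host name hints at the affiliation, the query string at the interest.
search_terms = entry["path"].partition("?")[2].split("+")
print(entry["host"], entry["time"].isoformat(), search_terms)
```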

Page 7: Privacy Implications of Online Data Collection

What about cookies?

Cookies can be useful
  used like a staple to attach multiple parts of a form together
  used to identify you when you return to a web site so you don’t have to remember a password
  used to help web sites understand how people use them

Cookies can be harmful
  used to profile users and track their activities, especially across web sites

Page 8: Privacy Implications of Online Data Collection

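[Diagram: YOU visit a search engine and a book store, each of which shows an ad from the same ad company. You search for medical information at the search engine and buy a book at the book store. The ad company sets a cookie and later reads it, so it can get your name and address from the book order and link them to your search.]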
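The linking step in this diagram can be sketched as a toy model in Python. Everything here is hypothetical and simplified: the class names `AdNetwork` and `Browser`, the cookie format, and the recorded details are assumptions for illustration, not a description of how any real ad network is implemented.

```python
import itertools


class AdNetwork:
    """Toy model of a third-party ad server embedded on many sites."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.profiles = {}  # cookie id -> list of (site, detail) observations

    def serve_ad(self, cookie, site, detail):
        """Record what was seen and return the cookie the browser should keep."""
        if cookie is None:
            cookie = f"id-{next(self._ids)}"  # first contact: set a cookie
        self.profiles.setdefault(cookie, []).append((site, detail))
        return cookie


class Browser:
    def __init__(self):
        self.ad_cookie = None  # cookie stored for the ad network's domain

    def visit(self, network, site, detail):
        # The page embeds an ad, so the browser contacts the ad network and
        # sends back whatever cookie it already holds for that network.
        self.ad_cookie = network.serve_ad(self.ad_cookie, site, detail)


network = AdNetwork()
you = Browser()
you.visit(network, "search engine", "searched for medical information")
you.visit(network, "book store", "bought book; name and address in order")

for cookie, seen in network.profiles.items():
    print(cookie, seen)
```

Both visits end up under one cookie id, which is what lets the ad company join the book order to the earlier search.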

Page 9: Privacy Implications of Online Data Collection

Referer log problems

GET methods result in values in URL
These URLs are sent in the referer header to next host
Example:
  http://www.merchant.com/cgi_bin/order?name=Tom+Jones&address=here+there&credit+card=234876923234&PIN=1234& -> index.html
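Any host that receives this URL in a Referer header can recover the form fields with a standard query-string parser. A minimal Python sketch follows; the function name `leaked_fields` is hypothetical, and the URL is the slide's example.

```python
from urllib.parse import parse_qs, urlsplit


def leaked_fields(referer: str) -> dict:
    """Return the form fields exposed in the query string of a Referer URL."""
    query = urlsplit(referer).query
    return {name: values[0] for name, values in parse_qs(query).items()}


REFERER = (
    "http://www.merchant.com/cgi_bin/order?"
    "name=Tom+Jones&address=here+there&credit+card=234876923234&PIN=1234&"
)

fields = leaked_fields(REFERER)
for name, value in fields.items():
    print(f"{name} = {value}")
```

Submitting the form with POST keeps these values out of the URL and therefore out of the next host's referer log.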

Page 10: Privacy Implications of Online Data Collection

What DoubleClick knows… about Richard M. Smith

Personal data:
  My Email address
  My full name
  My mailing address (street, city, state, and Zip code)
  My phone number

Transactional data:
  Names of VHS movies I am interested in buying
  Details of a plane trip
  Search phrases used at search engines
  Health conditions

Page 11: Privacy Implications of Online Data Collection

No clicks required

“It was not necessary for me to click on the banner ads for information to be sent to DoubleClick servers.”

– Richard M. Smith
http://www.tiac.net/users/smiths/privacy/banads.htm

Page 12: Privacy Implications of Online Data Collection

DoubleClick examples

AltaVista Yellow Pages – Complete home address (Fixed January 2000)

Banner ad URL: http://live.av.com/scripts/search.dll?ep=7&gca=address&orderby=distance&sstreet=172+mason+terr&scity=brookline&sstate=MA&szip=02446&scountry=USA&query=sinsa&qname=&sic=&ck=&userid=130782922&userpw=.&uh=130782922,0,&ccity=brookline&cstate=MA&ver=hb1.2.2

Travelocity – Email address

Referring URL: http://dps1.travelocity.com/[email protected]

Page 13: Privacy Implications of Online Data Collection

Merging online and offline data

In mid-February DoubleClick announced plans to merge “anonymous” online data with personal information obtained from offline databases

By the first week in March the plans were put on hold

Page 14: Privacy Implications of Online Data Collection

Public concern

April 1997 Louis Harris Poll of Internet users
  5% say they have been the victim of an invasion of privacy while on the Internet
  53% say they are concerned that information about which sites they visit will be linked to their email address and disclosed without their knowledge

See also “Beyond Concern” study: http://www.research.att.com/projects/privacystudy/

Page 15: Privacy Implications of Online Data Collection

International issues

European Union Data Directive prohibits secondary uses of data without informed consent
  Creating personally-identifiable online profiles will have to be opt-in in most cases
  Upfront notice must be given when data is collected – no web bugs
  No transfer of data to non-EU countries unless there is adequate privacy protection

Page 16: Privacy Implications of Online Data Collection

Children's issues

Children’s Online Privacy Protection Act (COPPA) requires parental consent before collecting personally-identifiable data from children online

Page 17: Privacy Implications of Online Data Collection

Subpoenas

Data on online activities is increasingly of interest in civil and criminal cases
The only way to avoid subpoenas is to not have data
Your files on your computer in your home have much greater legal protection than your files stored on a server on the network

Page 18: Privacy Implications of Online Data Collection

Privacy concerns

Data is often collected silently
  Web allows lots of data to be collected easily, cheaply, unobtrusively and automatically
  Individuals not given meaningful choice

Data from many sources may be merged
  Even non-identifiable data can become identifiable when merged

Data collected for business purposes may be used in civil and criminal proceedings
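The point about merging can be illustrated with a small join in Python. All records below are made up for illustration, and the names `clickstream`, `orders`, and `merge_by_cookie` are hypothetical. The sketch assumes the two data sets share a cookie identifier, which is the kind of common key that makes merging possible.

```python
# Hypothetical records; every value below is invented for illustration.
clickstream = [
    {"cookie": "A17", "site": "search engine", "detail": "medical information"},
    {"cookie": "A17", "site": "book store", "detail": "order form"},
    {"cookie": "B42", "site": "news site", "detail": "front page"},
]
orders = [
    {"cookie": "A17", "name": "Tom Jones", "address": "here there"},
]


def merge_by_cookie(clicks, identified):
    """Attach a name to every click whose cookie also appears in an identified record."""
    names = {record["cookie"]: record["name"] for record in identified}
    return [
        {**click, "name": names[click["cookie"]]}
        for click in clicks
        if click["cookie"] in names
    ]


merged = merge_by_cookie(clickstream, orders)
for row in merged:
    print(row["name"], "-", row["site"], "-", row["detail"])
```

On its own, the clickstream names nobody. One identified record with the same key is enough to put a name on every earlier click under that cookie.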

Page 19: Privacy Implications of Online Data Collection

Some solutions

Privacy policies
Voluntary guidelines and codes of conduct
Seal programs
Infomediaries
Technologies for facilitating notice and choice
  P3P

Page 20: Privacy Implications of Online Data Collection

P3P1.0 – A First Step

Offers an easy way for web sites to communicate about their privacy policies in a standard machine-readable format
  Can be deployed using existing web servers

This will enable users to use tools that:
  Display symbols, play sounds, or provide snapshots of sites’ policies
  Display symbols or prompts after comparing policies with user preferences

Page 21: Privacy Implications of Online Data Collection

P3P is a Partial Solution

P3P1.0 helps users understand privacy policies but is not a complete solution
Seal programs and regulations help ensure that sites comply with their policies
Anonymity tools reduce the amount of information revealed while browsing
Encryption tools secure data in transit and storage
Laws and codes of practice provide a baseline level for acceptable policies

Page 22: Privacy Implications of Online Data Collection

Implementing a P3P 1.0 Server

Formulate privacy policy
Translate privacy policy into P3P format
Place P3P policy on web site
  One policy for entire site or multiple policies for different parts of the site
Associate policy with web resources:
  Configure server to insert P3P header with link to P3P policy; or
  Insert link to P3P policy in HTML content
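The "configure server to insert P3P header" option can be sketched as a small piece of WSGI middleware in Python. This is an illustration under stated assumptions, not a reference implementation: the header names and values follow the HTTP example in this transcript as best it can be read, the policy URL is the transcript's placeholder `http://foo.com/p3p.xml`, and the names `p3p_headers`, `add_p3p`, and `page` are hypothetical. The P3P specification as finally published defines a different response header, so check the current specification before relying on this syntax.

```python
P3P_NAMESPACE = "http://www.w3.org/2000/P3Pv1/"
POLICY_URL = "http://foo.com/p3p.xml"  # placeholder URL taken from the slides


def p3p_headers(policy_url, prefix="11"):
    """Build the two extension headers that link a response to its P3P policy."""
    return [
        ("Opt", f"{P3P_NAMESPACE}; ns={prefix}"),
        (f"{prefix}-Policy", policy_url),
    ]


def add_p3p(app, policy_url):
    """WSGI middleware: append the P3P headers to every response from `app`."""
    def wrapped(environ, start_response):
        def start_with_p3p(status, headers, exc_info=None):
            return start_response(status, headers + p3p_headers(policy_url), exc_info)
        return app(environ, start_with_p3p)
    return wrapped


def page(environ, start_response):
    """Stand-in for the site's existing application."""
    start_response("200 OK", [("Content-Type", "text/html")])
    return [b"<html><body>catalog</body></html>"]


application = add_p3p(page, POLICY_URL)
```

Because the middleware wraps the existing application, the site's pages do not have to change, which matches the slide's claim that P3P can be deployed using existing web servers. Any WSGI server, such as the standard library's `wsgiref.simple_server`, can serve `application`.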

Page 23: Privacy Implications of Online Data Collection

A simple HTTP transaction

Web Server

Request web page:
  GET /x.html HTTP/1.1
  Host: foo.com
  . . .

Send web page:
  HTTP/1.1 200 OK
  Content-Type: text/html
  . . .

Page 24: Privacy Implications of Online Data Collection

A simple HTTP transaction with P3P 1.0 added

Web Server

Request web page:
  GET /x.html HTTP/1.1
  Host: foo.com
  . . .

Send web page:
  HTTP/1.1 200 OK
  Opt: http://www.w3.org/2000/P3Pv1/; ns=11
  11-Policy: http://foo.com/p3p.xml
  Content-Type: text/html
  . . .

Request P3P Policy:
  GET /p3p.xml HTTP/1.1
  Host: foo.com
  . . .

Send P3P Policy:
  HTTP/1.1 200 OK
  . . .

Page 25: Privacy Implications of Online Data Collection

Implementing a P3P1.0 Client

Client can be implemented as browser, proxy, plug-in, part of an electronic wallet, Java applet, JavaScript, etc.
  Can be entirely server side

Look for link to P3P policy and fetch policy with HTTP GET request

Parse policy and take appropriate action
  Display symbol, play sound, prompt user, etc.
  Action can optionally be based on user preferences
  Action can optionally allow data to be automatically filled into form or transferred from electronic wallet
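The "look for link and fetch" step can be sketched in Python as follows. It assumes the header layout read from the HTTP example in this transcript (an `Opt` header that declares a numeric prefix, and a `<prefix>-Policy` header carrying the URL); the final P3P specification may differ. The names `find_policy_url` and `fetch_policy` are hypothetical.

```python
import re
import urllib.request

P3P_NAMESPACE = "http://www.w3.org/2000/P3Pv1/"


def find_policy_url(headers):
    """Return the P3P policy URL announced in response headers, or None."""
    lowered = {name.lower(): value for name, value in headers.items()}
    opt = lowered.get("opt", "")
    if P3P_NAMESPACE not in opt:
        return None  # the response does not declare the P3P extension
    match = re.search(r"ns=(\d+)", opt)
    if match is None:
        return None
    return lowered.get(f"{match.group(1)}-policy")


def fetch_policy(policy_url):
    """Fetch the policy document with an HTTP GET request (not called below)."""
    with urllib.request.urlopen(policy_url) as response:
        return response.read().decode("utf-8")


# Headers as they would arrive with the web page in the slide's example.
RESPONSE_HEADERS = {
    "Opt": "http://www.w3.org/2000/P3Pv1/; ns=11",
    "11-Policy": "http://foo.com/p3p.xml",
    "Content-Type": "text/html",
}

policy_url = find_policy_url(RESPONSE_HEADERS)
print(policy_url or "no P3P policy announced")
```

`fetch_policy` is defined but not called, because `foo.com` is only the slides' placeholder host. A real client would pass the fetched document to a policy parser and then choose an action such as showing a symbol or prompting the user.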

Page 26: Privacy Implications of Online Data Collection

Some P3P Client Ideas

Symbols for how data is used
  complete transaction
  R&D
  Customization
  marketing

Symbols to indicate whether data is shared

Symbols to indicate site has privacy seal

Symbols to indicate compliance with laws and regulations
  complies with German law
  complies with German law if user gives informed consent
  does not comply with German law

Symbols to indicate match/mismatch with user preferences
  information about cause of mismatch on mouse-over

Page 27: Privacy Implications of Online Data Collection

P3P Policies

Machine-readable (XML) version of web site privacy policies
Use P3P Vocabulary to express data practices
Use P3P Base Data Set to express type of data collected
Capture common elements of privacy policies but may not express everything (sites may provide further explanation in human-readable policies)

Page 28: Privacy Implications of Online Data Collection

The P3P Vocabulary

Who is collecting data?

What data is collected?

For what purpose will data be used?

Is there an ability to opt in to or opt out of some data uses?

Who are the data recipients (anyone beyond the data collector)?

To what information does the data collector provide access?

What is the data retention policy?

How will disputes about the policy be resolved?

Where is the human-readable privacy policy?

Page 29: Privacy Implications of Online Data Collection

Example Privacy Policy

TheCoolCatalog of 123 Main Street, Bethesda, MD 20814, USA, makes the following statement for the Web page at http://www.TheCoolCatalog.com/catalog/. We have a privacy seal from PrivacySeal.org. Our privacy policy is posted at http://www.TheCoolCatalog.com/PrivacyPractice.html. We do not provide access capabilities to information we have about you.

We use cookies and collect your gender, information about your clothing preferences, and (optionally) your home address to customize our entry catalog pages and for our own research and product development. We retain this information indefinitely.

We also maintain server logs that include information about visits to the http://www.TheCoolCatalog.com/catalog/ page, and the types of browsers our visitors use. We use this information in order to maintain and improve our web site. We retain this information indefinitely.

Page 30: Privacy Implications of Online Data Collection

P3P/XML Encoding

<POLICY xmlns="http://www.w3.org/2000/P3Pv1"
    entity="TheCoolCatalog, 123 Main Street, Bethesda, MD 20814, USA">
  <DISPUTES-GROUP>
    <DISPUTES resolution-type="independent"
        service="http://www.PrivacySeal.org"
        description="PrivacySeal.org"
        image="http://www.PrivacySeal.org/Logo.gif"/>
  </DISPUTES-GROUP>
  <DISCLOSURE discuri="http://www.TheCoolCatalog.com/PrivacyPractice.html" access="none"/>
  <STATEMENT>
    <CONSEQUENCE-GROUP>
      <CONSEQUENCE>a site with clothes you would appreciate</CONSEQUENCE>
    </CONSEQUENCE-GROUP>
    <RECIPIENT><ours/></RECIPIENT>
    <PURPOSE><custom/><develop/></PURPOSE>
    <RETENTION><indefinitely/></RETENTION>
    <DATA-GROUP>
      <DATA name="dynamic.cookies" category="state"/>
      <DATA name="dynamic.miscdata" category="preference"/>
      <DATA name="user.gender"/>
      <DATA name="user.home." optional="yes"/>
    </DATA-GROUP>
  </STATEMENT>
  <STATEMENT>
    <RECIPIENT><ours/></RECIPIENT>
    <PURPOSE><admin/><develop/></PURPOSE>
    <RETENTION><indefinitely/></RETENTION>
    <DATA-GROUP>
      <DATA name="dynamic.clickstream.server"/>
      <DATA name="dynamic.http.useragent"/>
    </DATA-GROUP>
  </STATEMENT>
</POLICY>
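A client's "parse policy and compare with user preferences" step could look roughly like the Python sketch below. It uses an abbreviated copy of the policy above (the DISPUTES, DISCLOSURE, and CONSEQUENCE elements are left out to keep it short). The function names `summarize` and `mismatches` and the example preference set are assumptions made for illustration; they are not part of P3P.

```python
import xml.etree.ElementTree as ET

NS = {"p3p": "http://www.w3.org/2000/P3Pv1"}

# Abbreviated copy of the example policy.
POLICY_XML = """\
<POLICY xmlns="http://www.w3.org/2000/P3Pv1"
    entity="TheCoolCatalog, 123 Main Street, Bethesda, MD 20814, USA">
  <STATEMENT>
    <RECIPIENT><ours/></RECIPIENT>
    <PURPOSE><custom/><develop/></PURPOSE>
    <RETENTION><indefinitely/></RETENTION>
    <DATA-GROUP>
      <DATA name="dynamic.cookies" category="state"/>
      <DATA name="user.gender"/>
      <DATA name="user.home." optional="yes"/>
    </DATA-GROUP>
  </STATEMENT>
  <STATEMENT>
    <RECIPIENT><ours/></RECIPIENT>
    <PURPOSE><admin/><develop/></PURPOSE>
    <RETENTION><indefinitely/></RETENTION>
    <DATA-GROUP>
      <DATA name="dynamic.clickstream.server"/>
      <DATA name="dynamic.http.useragent"/>
    </DATA-GROUP>
  </STATEMENT>
</POLICY>
"""


def local(tag):
    """Strip the '{namespace}' part from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def summarize(policy_xml):
    """Return one dictionary per STATEMENT with its purposes, recipients, and data."""
    root = ET.fromstring(policy_xml)
    statements = []
    for statement in root.findall("p3p:STATEMENT", NS):
        statements.append({
            "purposes": [local(e.tag) for e in statement.find("p3p:PURPOSE", NS)],
            "recipients": [local(e.tag) for e in statement.find("p3p:RECIPIENT", NS)],
            "retention": [local(e.tag) for e in statement.find("p3p:RETENTION", NS)],
            "data": [d.get("name") for d in statement.findall("p3p:DATA-GROUP/p3p:DATA", NS)],
        })
    return statements


def mismatches(statements, unwanted_purposes):
    """List (statement index, purpose) pairs that conflict with the user's preferences."""
    return [
        (index, purpose)
        for index, statement in enumerate(statements)
        for purpose in statement["purposes"]
        if purpose in unwanted_purposes
    ]


summary = summarize(POLICY_XML)
conflicts = mismatches(summary, unwanted_purposes={"develop"})  # hypothetical preference
print(conflicts)
```

A client could map an empty `conflicts` list to a "match" symbol and a non-empty one to a prompt that names the conflicting purpose.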

Page 31: Privacy Implications of Online Data Collection

PrivacyBank.Com

PrivacyBank bookmark

Page 32: Privacy Implications of Online Data Collection

Infomediary example: PrivacyBank

PrivacyBank bookmark

Page 33: Privacy Implications of Online Data Collection

Challenge

Data is useful for research, targeting potential customers, building relationships with customers, etc.

Privacy laws make data collection more difficult

Data collectors have personal privacy concerns too

How can we collect data in ways that reduce privacy concerns while remaining useful for research and business?