PRIVACY USC CSCI499 Dr. Genevieve Bartlett USC/ISI

Upload: naasir

Post on 24-Feb-2016



TRANSCRIPT

Page 1: Privacy USC CSci499

PRIVACY
USC CSCI499
Dr. Genevieve Bartlett
USC/ISI

Page 2: Privacy USC CSci499

Privacy
The state or condition of being free from observation.

Page 3: Privacy USC CSci499

Privacy
The state or condition of being free from observation.
Not really possible today… at least not on the internet.

Page 4: Privacy USC CSci499

Privacy
The right of people to choose freely under what circumstances and to what extent they will reveal themselves, their attitude, and their behavior to others.

Page 5: Privacy USC CSci499

Privacy is not black and white
Lots of grey areas and points for discussion
What seems private to you may not seem private to me
Three examples to start us off:
  HTTP Cookies
  Google Street View
  Facebook

Page 6: Privacy USC CSci499

HTTP cookies: What are they?
Cookies = small text files
  Received from a server, stored on your machine
  Usually a web server
Purpose: HTTP is stateless, so cookies maintain state for the HTTP protocol
  E.g. keeping the contents of your “shopping cart” while you browse a site
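A minimal sketch of the shopping cart idea, assuming a made-up in-memory server: every request is handled independently, and the cookie is the only thing tying two requests together. The names `CARTS` and `handle_request` and the toy session ids are invented for this illustration.

```python
from http.cookies import SimpleCookie

# Hypothetical in-memory "server" state: session id -> shopping cart contents.
CARTS = {}

def handle_request(cookie_header, add_item):
    """Handle one stateless HTTP request; return the cookie to send back."""
    cookie = SimpleCookie(cookie_header or "")
    if "session" in cookie:
        session_id = cookie["session"].value
    else:
        # Toy id for the sketch; real session ids must be unguessable.
        session_id = "s%d" % (len(CARTS) + 1)
    CARTS.setdefault(session_id, []).append(add_item)
    reply = SimpleCookie()
    reply["session"] = session_id
    return reply["session"].OutputString()

# The browser stores the cookie and sends it back with the next request.
stored = handle_request(None, "unicorn poster")
handle_request(stored, "glitter")
cart = CARTS["s1"]
```

Without the cookie, the second request would look like a brand new visitor and start an empty cart.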

Page 7: Privacy USC CSci499

HTTP cookies: 3rd party cookies

You visited your favorite site unicornsareawesome.com

unicornsareawesome.com pulls ads from lameads.com

You get a cookie from lameads.com, even though you never visited lameads.com

lameads.com can track your browsing habits every time you visit any page with ads from lameads.com… those might be a lot of pages
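The tracking described above can be sketched as a small simulation. Everything here is assumed for illustration: the log `seen_by_lameads`, the function `load_page`, the id format, and the extra site names are not from the slides.

```python
from collections import defaultdict

# Hypothetical ad server log: tracking-cookie id -> pages where its ads were loaded.
seen_by_lameads = defaultdict(list)

def load_page(site, embeds_lameads, browser_cookies):
    """Simulate a browser loading a page that may embed an ad from lameads.com."""
    if not embeds_lameads:
        return
    # The browser keeps cookies per domain, so lameads.com always gets its own cookie back.
    uid = browser_cookies.setdefault(
        "lameads.com", "uid-%d" % (len(seen_by_lameads) + 1)
    )
    # The ad request typically reveals which page embedded the ad (e.g. via Referer).
    seen_by_lameads[uid].append(site)

cookies = {}
load_page("unicornsareawesome.com", True, cookies)
load_page("example-news.test", True, cookies)
load_page("no-ads.test", False, cookies)
profile = seen_by_lameads[cookies["lameads.com"]]
```

After three page loads, `profile` lists every visited site that carried a lameads.com ad, even though the user never typed lameads.com into the browser.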

Page 8: Privacy USC CSci499

HTTP cookies: Grey Area?
3rd party cookies allow ad servers to personalize your ads = more useful to you. Good!
But
  You chose to go to unicornsareawesome.com = you are OK with unicornsareawesome.com knowing how you use their site
  Nowhere did you choose to let lameads.com monitor your browsing habits

Page 9: Privacy USC CSci499

Short Discussion:
Collusion: a tool to track these 3rd party cookies
TED talk on “Tracking the Trackers”
http://www.ted.com/talks/gary_kovacs_tracking_the_trackers.html

Page 10: Privacy USC CSci499

Google Street View: What is it?
Google cars drive around and take 360° panoramic pictures.
Images are stitched together and can be browsed on the Internet.

Page 11: Privacy USC CSci499

Google Street View: Me

Page 12: Privacy USC CSci499

Google Street View: Lots to See

Page 13: Privacy USC CSci499

Google Street View: Grey Area
Expectation of privacy?
  I’m in public, I can expect people will see me
Expectations?
  Picture linked to location
  Searchable
  Widely available
  Available for a long time to come

Page 14: Privacy USC CSci499

Facebook: What is it?
Social networking site
  Connect with friends
  Share pictures, interests (“likes”)

Page 15: Privacy USC CSci499

Facebook: Grey Area
Who uses Facebook data and how is data used?
  4.7 million liked a page about health conditions or treatments. Insurance agents?
  4.8 million shared information about dates of vacations. Burglars?
  2.6 million discussed recreational use of alcohol. Employers?

Page 16: Privacy USC CSci499

Facebook: More Grey
Security issues with Facebook
Confusion over privacy settings
Sudden changes in default privacy settings
Facebook tracks browsing habits, even if a user isn’t logged in (third-party cookies)
Facebook sells user information to ad agencies and behavioral trackers

Page 17: Privacy USC CSci499
Page 18: Privacy USC CSci499
Page 19: Privacy USC CSci499

Why start with these examples?
3 examples: HTTP cookies, Google Street View, Facebook
  Lots more “every day” examples
Users gain benefits by sharing data
Tons of data generated, widely shared and accessible, and stored (for how long?)
Are users really aware of how their data is used and by whom?

Page 20: Privacy USC CSci499

Today’s Agenda
Privacy and Privacy & Security
How do we “safely” share private data?
Privacy and Inferred Information
Privacy and Social Networks
How do we design a system with privacy in mind?

Page 21: Privacy USC CSci499

Privacy and Privacy & Security
How do we “safely” share private data?
Privacy and Inferred Information
Privacy and Social Networks
How do we design a system with privacy in mind?

Page 22: Privacy USC CSci499

Examples of private information
Tons of information can be gained from Internet use:
Behavior
  E.g. Person X reads reddit.com at work.
Preferences
  E.g. Person Y likes high heel shoes and uses Apple products.
Associations
  E.g. Person X and Person Y are friends.
PPI (private, personal/protected information)
  Credit card #s, SSN, nicknames, addresses
PII (personally identifying information)
  E.g. Your age + your address = I know who you are, even if I’m not given your name.
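A small sketch of why combining fields matters, using made-up records with names already removed. The function `unique_combinations` and the sample data are assumptions for illustration only.

```python
from collections import Counter

# Made-up records with names already removed.
records = [
    {"age": 34, "address": "12 Oak St"},
    {"age": 34, "address": "98 Elm St"},
    {"age": 61, "address": "12 Oak St"},
    {"age": 34, "address": "98 Elm St"},
]

def unique_combinations(rows, fields):
    """Return the field-value combinations that match exactly one row."""
    counts = Counter(tuple(row[f] for f in fields) for row in rows)
    return [combo for combo, n in counts.items() if n == 1]

by_age = unique_combinations(records, ["age"])
by_both = unique_combinations(records, ["age", "address"])
```

In this toy data, age alone singles out one record, while age plus address singles out two: each extra field can make more people uniquely identifiable.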

Page 23: Privacy USC CSci499

How do we achieve privacy?
policy + security mechanisms + law + ethics + trust
Anonymity & anonymization mechanisms
  Make each user indistinguishable from the next
  Remove PPI & PII
  Aggregate information

Page 24: Privacy USC CSci499

Who wants private info?
Governments – surveillance
Businesses – targeted advertising, following trends
Attackers – monetize information or cause havoc
Researchers – medical, behavioral, social, computer

Page 25: Privacy USC CSci499

Who has private info?
You and me
  End-users
  Customers
  Patients
Businesses
  Protect mergers, product plans, investigations
Government & law enforcement
  National security
  Criminal investigations

Page 26: Privacy USC CSci499

Privacy and Security
Security enables privacy
  Data is only as safe as the system it’s on
Sometimes security is at odds with privacy
  E.g. Security requires authentication, but privacy is achieved through anonymity
  E.g. TSA pat down at the airport

Page 27: Privacy USC CSci499

Privacy and Privacy & Security
How do we “safely” share private data?
Privacy and Inferred Information
Privacy and Social Networks
How do we design a system with privacy in mind?

Page 28: Privacy USC CSci499

Why do we want to share?
Share existing data sets:
  Research
  Companies
    Buy data from each other
    Check out each other’s assets before mergers/buyouts
Start a new dataset:
  Mutually beneficial relationships
    Share data with me and you can use this service

Page 29: Privacy USC CSci499

Sharing everything?
Easy, but what are the ramifications?
Legal/policy may limit what can be shared/collected
  IRBs: Institutional Review Boards
  HITECH & HIPAA: Health Insurance Portability and Accountability Act
Future use and protection of data?

Page 30: Privacy USC CSci499

Mechanisms for limited sharing
Remove really sensitive stuff (sanitization)
  PPI & PII (private/protected information & personally identifying information)
  Without a crystal ball, this is hard
Anonymization
  Replace information to limit the ability to tie entities to meaningful identities
Aggregation
  Remove PII by only collecting/releasing statistics

Page 31: Privacy USC CSci499

Anonymization Example Network trace:

PAYLOAD

Page 32: Privacy USC CSci499

Anonymization Example Network trace:

PAYLOAD

All sorts of PII and PPI in there!

Page 33: Privacy USC CSci499

Anonymization Example Network trace:

PAYLOAD

Routing information: IP addresses, TCP flags/options, OS fingerprinting

Page 34: Privacy USC CSci499

Anonymization Example Network trace:

PAYLOAD

Remove IPs? Anonymize IPs?

Page 35: Privacy USC CSci499

Anonymization Example
Network trace:

PAYLOAD

Removing IPs severely limits what you can do with the data.
Replace with something identifying, but not the same data.
  IP1 = A
  IP2 = B
  Etc.
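The replacement idea (IP1 = A, IP2 = B) can be sketched as a consistent mapping: each distinct address gets one stable label, so who-talks-to-whom structure survives while the real addresses do not. The helper `make_anonymizer`, the `host-N` labels, and the sample trace (documentation-range addresses) are assumptions for this sketch.

```python
from itertools import count

def make_anonymizer():
    """Return a function that maps each distinct IP to a stable label."""
    mapping = {}
    labels = count(1)

    def anonymize(ip):
        # First sighting of an address allocates the next label; later sightings reuse it.
        if ip not in mapping:
            mapping[ip] = "host-%d" % next(labels)
        return mapping[ip]

    return anonymize

anonymize = make_anonymizer()
# Made-up trace of (source, destination) pairs.
trace = [
    ("192.0.2.1", "198.51.100.7"),
    ("192.0.2.9", "198.51.100.7"),
    ("192.0.2.1", "192.0.2.9"),
]
anonymized = [(anonymize(src), anonymize(dst)) for src, dst in trace]
```

The mapping table itself is now the sensitive part: whoever holds it can undo the anonymization, so it would need to be protected or discarded.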

Page 36: Privacy USC CSci499

Aggregation Example
“Fewer U.S. Households Have Debt, But Those Who Do Have More, Census Bureau Reports”

Page 37: Privacy USC CSci499

Methods can be bad or good
Just because someone uses aggregation or anonymization doesn’t mean the data is safe
Example:
  Release aggregate stats of people’s favorite color?
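One possible way an aggregate can still leak, sketched under assumptions not in the slides: if statistics are broken down by group and a group has only one member, the "aggregate" is that person's answer. The survey data, the per-department breakdown, and the `min_group_size` threshold are all invented for this illustration.

```python
from collections import Counter, defaultdict

# Made-up survey: (department, favorite color).
survey = [("sales", "blue"), ("sales", "red"), ("sales", "blue"), ("legal", "green")]

def color_stats(rows, min_group_size):
    """Count favorite colors per department, withholding groups that are too small."""
    groups = defaultdict(Counter)
    for dept, color in rows:
        groups[dept][color] += 1
    return {
        dept: dict(colors)
        for dept, colors in groups.items()
        if sum(colors.values()) >= min_group_size
    }

naive = color_stats(survey, 1)       # publishes the one-person "legal" group
suppressed = color_stats(survey, 3)  # withholds groups smaller than 3
```

With no minimum size, the release tells everyone the favorite color of the only person in legal; with a threshold, that group is simply left out.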

Page 38: Privacy USC CSci499

Privacy and Privacy & Security
How do we “safely” share private data?
Privacy and Inferred Information
Privacy and Social Networks
How do we design a system with privacy in mind?

Page 39: Privacy USC CSci499

What is Inferred?
Take 2 sources of information, correlate data
  X + Y = ….
Example: Google Street View + what my car looks like + where I live = you know where I was back in November

Page 40: Privacy USC CSci499

Another example
Paula Broadwell, who had an affair with CIA director David Petraeus, similarly took extensive precautions to hide her identity. She never logged in to her anonymous e-mail service from her home network. Instead, she used hotel and other public networks when she e-mailed him. The FBI correlated hotel registration data from several different hotels -- and hers was the common name.
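The correlation step amounts to intersecting lists: each list alone is harmless, but only one name appears on all of them. A minimal sketch with invented guest lists (the names and the function `common_names` are placeholders, not real data):

```python
# Made-up guest lists for the hotels whose networks were used.
guest_lists = [
    {"A. Smith", "P. Jones", "R. Lee"},
    {"P. Jones", "M. Chen"},
    {"K. Patel", "P. Jones", "A. Smith"},
]

def common_names(lists):
    """Return the names that appear on every list."""
    return set.intersection(*lists)

suspects = common_names(guest_lists)
```

Each additional list shrinks the candidate set, which is why using a different public network every time did not help.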

Page 41: Privacy USC CSci499

Another example: Netflix & IMDB
Netflix prize: released an anonymized dataset
Correlated with IMDB: undid anonymization (University of Texas)
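A toy sketch of the general idea of linking two datasets by overlapping ratings. This is not the method the University of Texas researchers used; it is a simplified illustration, and the data, pseudonyms, and the function `best_match` are all made up.

```python
# Made-up data: "anonymized" ratings keyed by pseudonym, public ratings keyed by name.
anonymized = {
    "user-17": {("Film A", 5), ("Film B", 2), ("Film C", 4)},
    "user-42": {("Film A", 3), ("Film D", 5)},
}
public = {
    "alice": {("Film A", 5), ("Film C", 4)},
    "bob": {("Film D", 5), ("Film E", 1)},
}

def best_match(public_ratings, candidates):
    """Return the pseudonym whose ratings overlap most with the public ratings."""
    return max(candidates, key=lambda pid: len(candidates[pid] & public_ratings))

links = {name: best_match(ratings, anonymized) for name, ratings in public.items()}
```

Once a pseudonym is linked to a name, every other rating under that pseudonym (here, "Film B") is attributed to that person too.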

Page 42: Privacy USC CSci499

Privacy and Privacy & Security
How do we “safely” share private data?
Privacy and Inferred Information
Privacy and Social Networks
How do we design a system with privacy in mind?

Page 43: Privacy USC CSci499

What is social networking data?
Associations
  Not what you say, but who you talk to

OMG NEW BOYFRIEND

Page 44: Privacy USC CSci499

Why is social data interesting?
From a privacy point of view:
  Guilt by association
  E.g. Governments are very interested
    Phone records (US)
    Facebook activity (Iran)

Page 45: Privacy USC CSci499

Computer Communication
Computer communication = social network
What sites/servers you visit/use = information on your relationship with those sites/servers
Never mind the content… How often you visit and who you visit may reveal a lot!

You Unicornsareawesome.com
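A short sketch of how much an observer can tally without ever reading content. The flow records, the extra destination name, and the function `contact_counts` are assumptions for illustration.

```python
from collections import Counter

# Made-up flow records an on-path observer could log: (source, destination, hour of day).
flows = [
    ("you", "unicornsareawesome.com", 9),
    ("you", "unicornsareawesome.com", 13),
    ("you", "clinic.example", 13),
    ("you", "unicornsareawesome.com", 22),
]

def contact_counts(records):
    """Count how often each source contacts each destination; no payload needed."""
    return Counter((src, dst) for src, dst, _hour in records)

counts = contact_counts(flows)
```

Even this tiny log shows a favorite site and a single visit to a clinic, all from source, destination, and time.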

Page 46: Privacy USC CSci499

How do we provide privacy?

Of course encrypt content (payload)!
But: Network/transport layer = no encryption (for now)
Anyone along the path can see source and destination… so now what?

Page 47: Privacy USC CSci499

Onion Routing
General idea: bounce connection through a bunch of machines

Page 48: Privacy USC CSci499

Don’t we bounce around already?

Not actually what happens……

Page 49: Privacy USC CSci499

Don’t we bounce around already?

Closer to what actually happens.

Page 50: Privacy USC CSci499

Don’t we bounce around already?

Yes, we route packets through a series of routers

BUT this doesn’t protect the privacy of who’s talking to whom…

Why?

PAYLOAD

Page 51: Privacy USC CSci499

Don’t we bounce around already?

Yes, we route packets through a series of routers

BUT this doesn’t protect the privacy of who’s talking to whom…

Why?

Contains routing information.

ENCRYPTED

Page 52: Privacy USC CSci499

Yes, we bounce… but:
Everyone along the way can see src & dst
Routes are easy to figure out

Contains routing information = can’t encrypt
Everyone along the path (routers and observers) can see who is talking to whom

ENCRYPTED

Page 53: Privacy USC CSci499

Onion routing saves us
Each router only knows about the last/next hop
Routes are hard to figure out
  Change frequently
  Chosen by the source

Page 54: Privacy USC CSci499

The Onion part of Onion Routing

Layers of encryption

PAYLOAD

Last hop’s key

Second hop’s key

First hop’s key
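The layering can be sketched with a toy cipher. This is only an illustration of the wrapping order: the XOR keystream below is NOT secure and is not what Tor uses, the shared placeholder keys stand in for the per-router keys, and the names `toy_cipher`, `build_onion`, and `peel` are invented for this sketch.

```python
import hashlib

def toy_cipher(key, data):
    """XOR data with a SHA-256-derived keystream. Toy only, NOT secure.
    The same call both encrypts and decrypts."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

def build_onion(payload, path):
    """path lists (hop_key, next_address) in travel order.
    The last hop's layer is applied first, so the first hop's layer is outermost."""
    for key, next_address in reversed(path):
        payload = toy_cipher(key, next_address + b"|" + payload)
    return payload

def peel(key, onion):
    """One hop removes its own layer and learns only where to forward the rest."""
    next_address, inner = toy_cipher(key, onion).split(b"|", 1)
    return next_address, inner

# Placeholder keys and addresses, invented for this sketch.
path = [
    (b"key-1", b"second"),
    (b"key-2", b"third"),
    (b"key-3", b"unicornsareawesome.com"),
]
onion = build_onion(b"GET /", path)

hops = []
for key, _ in path:
    next_address, onion = peel(key, onion)
    hops.append(next_address)
```

After the loop, `onion` holds the original payload, and each hop has seen only its own next address; only the last hop learns the real destination.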

Page 55: Privacy USC CSci499

Onion Routing Example: Tor

You

Unicornsareawesome.com

Page 56: Privacy USC CSci499

Onion Routing Example: Tor

You

Tor directory

Get a list of Tor routers from the publicly known Tor directory

Tor router IPs + public key for each router

Page 57: Privacy USC CSci499

Onion Routing Example: Tor

You

Unicornsareawesome.com

Tor Routers

Page 58: Privacy USC CSci499

Onion Routing Example: Tor

You

Unicornsareawesome.com

Choose a set of Tor routers to use

1st

2nd

3rd

Page 59: Privacy USC CSci499

Onion Routing Example: Tor

You

Unicornsareawesome.com

Packets are now encrypted with 3 keys

1st

2nd

3rd

Page 60: Privacy USC CSci499

Onion Routing Example: Tor

You

Unicornsareawesome.com

1st

2nd

3rd

Source: YOU, Dest: 1st Tor router

Page 61: Privacy USC CSci499

Onion Routing Example: Tor

You

Unicornsareawesome.com

1st

2nd

3rd

Decrypts 1st layer

Page 62: Privacy USC CSci499

Onion Routing Example: Tor

You

Unicornsareawesome.com

1st

2nd

3rd

Source: 1st Tor router, Dest: 2nd Tor router

Page 63: Privacy USC CSci499

Onion Routing Example: Tor

You

Unicornsareawesome.com

1st

2nd

3rd

Decrypts 2nd layer

Page 64: Privacy USC CSci499

Onion Routing Example: Tor

You

Unicornsareawesome.com

1st

2nd

3rd

Source: 2nd Tor router, Dest: 3rd Tor router

Page 65: Privacy USC CSci499

Onion Routing Example: Tor

You

Unicornsareawesome.com

1st

2nd

3rd

Decrypts last layer

Page 66: Privacy USC CSci499

Onion Routing Example: Tor

You

Unicornsareawesome.com

1st

2nd

3rd

Original (unencrypted) packet sent to server.

Source: 3rd Tor router, Dest: Unicornsareawesome.com

Page 67: Privacy USC CSci499

What does our attacker see?

Encrypted traffic from You, to 1st Tor router

You

Page 68: Privacy USC CSci499

What does our attacker see?

Other view points? Not easily traceable to you.

You

Page 69: Privacy USC CSci499

What does our attacker see?

Global view points? Very unlikely... But if so… trouble!

Page 70: Privacy USC CSci499

What does our attacker see?

Also unlikely… but an attacker who can observe both ends can correlate the traffic end to end.

Page 71: Privacy USC CSci499

Reliance on multiple users

What would happen here if You were the only one using Tor?

You

Page 72: Privacy USC CSci499

Side note: Tor is an overlay

Tor routers are often just someone’s regular machine. Traffic is still routed over regular routers too.

Page 73: Privacy USC CSci499

Onion Routing: Things to Note
Not perfect, but pretty nifty
End host (unicornsareawesome.com) does not need to know about the Tor protocol (good for wide usage and acceptance)
Data is encrypted all the way to the last Tor router
  If an end-to-end application protocol (like HTTPS) is using encryption, the payload is doubly encrypted along the Tor route.

Page 74: Privacy USC CSci499

Privacy and Privacy & Security
How do we “safely” share private data?
Privacy and Inferred Information
Privacy and Social Networks
How do we design a system with privacy in mind?

Page 75: Privacy USC CSci499

Designing privacy preserving systems
Aim for the minimum amount of information needed to achieve goals
Think through how info can be gained and inferred
  Inferred is often a gotcha! x + y = something private, but x and y by themselves don’t seem all that special
Think through how information can be gained
  On the wire? Stored in logs? At a router? At an ISP?

Page 76: Privacy USC CSci499

Privacy and Stored Information
Data is only as safe as the system
How long the data is stored affects privacy
  Longer term = bigger privacy risk (in general)
    Longer time frame, more data to correlate & infer
    Longer opportunity for data theft
    Increased chances of mistakes, lapsed security, etc.

Page 77: Privacy USC CSci499

An example of keeping privacy in mind

My work: P2P file sharing detection