Next Generation Cybertools: Social Science Research using Web Data. A project of Cornell University and the Internet Archive, funded by the National Science Foundation.

Upload: abel-james

Post on 20-Jan-2018

Page 1

Next Generation Cybertools: Social Science Research using Web Data

A project of Cornell University and the Internet Archive, funded by the National Science Foundation

Page 2

The NSF Cybertools Grant

Sociology: Michael Macy (Principal Investigator), David Strang

Computing and Information Science: Bill Arms, Dan Huttenlocher, Jon Kleinberg

Very Large Semi-Structured Datasets for Social Science Research

"Computer scientists have learned through experience that it is usually best to build software tools in close collaboration with users. Hence, our proposal is two-fold – to build an intelligent front-end that will make the Internet Archive data broadly accessible to social scientists, and to develop, test, and refine these tools through a specific research application – the diffusion of innovation."

Begins January 2006

Page 3

The Internet Archive

• Complete crawls of the Web, every two months since 1996

• Range of formats and depth of crawl have increased with time.

• No protected sites.

• Total archive is about 600 TByte

• Rate of increase is about 1 TByte/day
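The two figures above imply some simple arithmetic. The sketch below is a back-of-the-envelope projection, not part of the original slides: it assumes the growth rate stays constant at about 1 TByte/day, which the slide does not claim, and the function names `projected_size_tb` and `days_to_reach` are hypothetical.

```python
import math

# Approximate figures quoted on the slide.
ARCHIVE_SIZE_TB = 600.0
GROWTH_TB_PER_DAY = 1.0


def projected_size_tb(days_ahead, start_tb=ARCHIVE_SIZE_TB,
                      rate_tb_per_day=GROWTH_TB_PER_DAY):
    """Linear projection of archive size; assumes a constant growth rate."""
    if days_ahead < 0:
        raise ValueError("days_ahead must be non-negative")
    return start_tb + rate_tb_per_day * days_ahead


def days_to_reach(target_tb, start_tb=ARCHIVE_SIZE_TB,
                  rate_tb_per_day=GROWTH_TB_PER_DAY):
    """Whole days until the archive reaches target_tb under the same assumption."""
    if rate_tb_per_day <= 0:
        raise ValueError("rate_tb_per_day must be positive")
    if target_tb <= start_tb:
        return 0
    return math.ceil((target_tb - start_tb) / rate_tb_per_day)


if __name__ == "__main__":
    print(projected_size_tb(365))   # size after one year of constant growth
    print(days_to_reach(1000.0))    # days until the archive reaches 1000 TByte
```

Under this constant-rate assumption, one year of growth adds 365 TByte to the archive. Since the slide notes that the range of formats and the depth of crawl have increased over time, the real rate is unlikely to be constant, so any such figure is only a rough guide.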

Page 4

New Opportunities for Social Science Research

The Web as a social phenomenon

Political campaigns

Online retailing

Self-publication (blogs)

The Web as evidence

The spread of urban legends ("Einstein failed mathematics")

Diffusion of innovation (e.g. free hotel wireless internet)

Polarization of opinion

Page 5

Using the Web Laboratory

Using the laboratory

If you would like to use the Web Laboratory for your research or teaching, please contact me. The order in which we build the services will be decided by the demands of the users.

Page 6

Thanks

This work would not be possible without the forethought and longstanding commitment of the Internet Archive to capture and preserve the content of the Web for future generations.

The work is funded in part by National Science Foundation grant 0403340, with equipment support from Unisys.

The Cornell Theory Center's support for this project is funded in part by Microsoft, Dell and Intel.