the future of information chris pal assistant professor, computer science university of rochester


Upload: phillip-fisher

Post on 12-Jan-2016


Page 1

The Future of Information

Chris Pal

Assistant Professor, Computer Science University of Rochester

Page 2

What Comes to Your Mind?

For the words
• Picture
• Book
• Library
• Newspaper
• Radio
• Television
• Telephone
• Computer

Page 3

What Comes to Your Mind?

For the words
• Picture
• Book
• Library
• Newspaper
• Radio
• Television
• Telephone
• Computer

Now, let’s consider some recent developments…

Page 4

Electronic Storage of All Human Knowledge is Within Reach

The Internet Archive - Brewster Kahle

• The Wayback Machine
• Archive of the Internet from 1996 to the present
• Size: 2 petabytes of data
• Currently growing at 20 terabytes per month
• ‘Eclipses the amount of text contained in the world's largest libraries, including the Library of Congress’

Page 5

How is this Possible?

Storage Technology
• 1 terabyte of hard disk: approx. $500
• A petabyte: on the order of $1 million (8 racks)

Digitization Efforts
• Books: 10 cents/page, about $30/book (150,000 books)
• Audio: $10/hour for archival (100,000 items)
• Video: $15/hour to digitize (50,000 videos)

[Image: 120 TB rack]
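The figures on this slide and the previous one can be cross-checked with some simple arithmetic. The sketch below is only a back-of-the-envelope illustration: the constant and function names are made up for this example, and decimal prefixes (1 PB = 1000 TB) are assumed.

```python
# Back-of-the-envelope check of the storage figures quoted on the slides.
# All names are illustrative; decimal prefixes (1 PB = 1000 TB) are an assumption.

COST_PER_TB_USD = 500             # slide: 1 terabyte of hard disk, approx. $500
TB_PER_PB = 1000                  # assumption: decimal prefixes
RACK_CAPACITY_TB = 120            # slide: 120 TB rack
RACKS_PER_PB = 8                  # slide: a petabyte takes 8 racks
ARCHIVE_GROWTH_TB_PER_MONTH = 20  # slide: Internet Archive growth rate
BOOK_COST_CENTS = 3000            # slide: about $30/book
PAGE_COST_CENTS = 10              # slide: 10 cents/page


def disk_cost_usd(terabytes: int) -> int:
    """Raw hard-disk cost only; excludes racks, servers, power and staff."""
    return terabytes * COST_PER_TB_USD


petabyte_disk_cost = disk_cost_usd(TB_PER_PB)
eight_rack_capacity_tb = RACKS_PER_PB * RACK_CAPACITY_TB
monthly_growth_disk_cost = disk_cost_usd(ARCHIVE_GROWTH_TB_PER_MONTH)
pages_per_book = BOOK_COST_CENTS // PAGE_COST_CENTS

print(f"Raw disk for 1 PB: ${petabyte_disk_cost:,}")
print(f"Capacity of 8 racks: {eight_rack_capacity_tb} TB")
print(f"Raw disk for one month of archive growth: ${monthly_growth_disk_cost:,}")
print(f"Implied pages per book: {pages_per_book}")
```

Raw disk for a petabyte comes to about half of the quoted $1 million. The slide does not say what the remainder covers, so the difference is left unexplained here. Eight 120 TB racks give 960 TB, which is consistent with "a petabyte (8 racks)".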

Page 6

Also, Consider that Memory for Small Devices

• Has reached a critical point for many applications – importantly it is read/write

• Smaller and less expensive each year

$34 Retail $18 Retail

Page 7

Concrete Examples

• kilo 10^3
• mega 10^6
• giga 10^9
• tera 10^12
• peta 10^15
• exa 10^18
• zetta 10^21
• yotta 10^24

• 1 song as an MP3: 5 MB
• 400 songs on a 2 GB card
• 200,000 songs on a PC
• 200 million songs in a room

‘We’ are here now
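The song counts follow directly from the 5 MB-per-song figure. The sketch below reproduces them under the assumption of decimal prefixes. The function names are illustrative, and the 1 TB "PC" and 1 PB "room" capacities are inferred from the song counts rather than stated on the slide.

```python
# Reproduce the song-count examples from the 5 MB-per-song figure.
# Decimal prefixes are assumed; the helper names are illustrative only.

SONG_MB = 5  # slide: 1 song as an MP3, 5 MB
UNITS = ["MB", "GB", "TB", "PB", "EB"]


def songs_that_fit(capacity_mb: int) -> int:
    """Number of whole 5 MB songs that fit in the given capacity."""
    return capacity_mb // SONG_MB


def storage_needed_mb(song_count: int) -> int:
    """Total megabytes needed for the given number of songs."""
    return song_count * SONG_MB


def human_readable(megabytes: float) -> str:
    """Express a size in the largest unit that keeps the value below 1000."""
    value = megabytes
    for unit in UNITS:
        if value < 1000 or unit == UNITS[-1]:
            return f"{value:g} {unit}"
        value /= 1000
    return f"{value:g} {UNITS[-1]}"  # unreachable; kept for completeness


print(songs_that_fit(2 * 1000))                        # songs on a 2 GB card
print(human_readable(storage_needed_mb(200_000)))      # storage for a "PC" of songs
print(human_readable(storage_needed_mb(200_000_000)))  # storage for a "room" of songs
```

Read this way, the "PC" corresponds to roughly a terabyte and the "room" to roughly a petabyte.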

Page 8

Physical Transportation of Information can be Effective

• Data over radio is also being used in Mali
• Locally processed and re-distributed via broadcast radio
• Local access via local wireless or flash cards also possible
• Similar to a North American video store

From: S. Keshav, U. Waterloo

Page 9

What Should We Store First?

• There are still many choices
  1999 estimate of the world’s production of storable information: 1.5 exabytes

• Smallholders - Agricultural Information
  Examples: Core Historical Literature of Agriculture (CHLA)

Page 10

Digitization and Information Extraction

Page 11

Information Extraction

• Allows us to create structured databases from unstructured text (e.g. monster.com)

ID   Crop/Animal  Location       Issue       Remedial Measures
130  Sweet Corn   Long Island    Disease     Resistant strains
129  Wheat        Monroe County  Insect      Pesticide A
128  Dairy Cow    Ithaca         Milk Yield  etc.

• From the database we can:
  (1) enable better indexing and search
  (2) generate user tailored summaries & digests
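To make the idea concrete, here is a minimal sketch of turning free text into database records and then searching them. It uses simple keyword lexicons, which is an assumption made for illustration and not the method used in the talk. The lexicons, the `Record` class, and the sample report sentences are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical lexicons built from the example table; a real extraction
# system would typically learn such patterns from annotated text.
CROPS = ["sweet corn", "wheat", "dairy cow"]
LOCATIONS = ["long island", "monroe county", "ithaca"]
ISSUES = ["disease", "insect", "milk yield"]


@dataclass
class Record:
    record_id: int
    crop_or_animal: Optional[str]
    location: Optional[str]
    issue: Optional[str]


def first_match(text: str, lexicon: List[str]) -> Optional[str]:
    """Return the first lexicon term found as a whole word or phrase, else None."""
    lowered = text.lower()
    for term in lexicon:
        if re.search(r"\b" + re.escape(term) + r"\b", lowered):
            return term
    return None


def extract(record_id: int, text: str) -> Record:
    """Fill one structured record from one unstructured report."""
    return Record(
        record_id,
        first_match(text, CROPS),
        first_match(text, LOCATIONS),
        first_match(text, ISSUES),
    )


def search(records: List[Record], **criteria: str) -> List[Record]:
    """Return the records whose fields equal all of the given criteria."""
    return [
        r for r in records
        if all(getattr(r, field) == value for field, value in criteria.items())
    ]


# Made-up report sentences, used only to exercise the sketch.
reports = {
    130: "Growers on Long Island report a disease affecting sweet corn.",
    129: "Insect damage to wheat was observed in Monroe County.",
}
records = [extract(rid, text) for rid, text in reports.items()]
print(search(records, issue="insect"))
```

Once the text is in record form, both uses listed on the slide become straightforward: field-level search as in `search`, and per-user digests built by filtering on the fields a user cares about.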

Page 12

Ways to Impact Small Landholders

• Create archival information sources
  - Digitize existing general knowledge and past experimental information & extract DB records
  - Obtain and include local information sources
  - Create image to text for local languages

• Mediate the flow of current information
  - New technologies: seeds, fertilizers, etc.
  - Alerts about diseases
  - Market information, access to inputs, capital

• Filter Information, Process and Distribute
  - How do we go from raw information to the small landholder?
  - Challenges: literacy, infrastructure

Page 13

Paper, Subscriptions & Customization

• Traditional paper formats are still powerful (e.g. Classical Newsletters, BMPs, Spore, etc…)

• We can learn from magazine subscription models - market based implicit sustainability

• Information processing allows us to create user-customized digests in both electronic (e.g. text, audio) and paper formats: a custom ‘newspaper’

• User customized search and feedback are active areas of research

• What if information could search for you?

Page 14

Broadcast Radio

• Radio is comparatively low cost for information delivery to non-literate people

• Already used effectively for education in Africa, e.g. Education and Development Center in Africa reaches 80,000 children

• Can effectively reach women

• Low power receivers: solar power and hand cranked generators available for many years

Page 15

Radio Today and Tomorrow

1. Radio receivers can now be fully integrated into small, low power recording devices and cell phones

Allows for time shifting of broadcasts

2. New technologies for data broadcast using radio allow large areas to be covered with low infrastructure costs

Market information, weather, local information, audio metadata for indexing

Page 16

Indexing and Organizing Multimedia

Page 17

Feedback Mechanisms

• Online retail sites have already deployed techniques for rating products and media

• Spectrum of feedback:
  - Simple numerical rating
  - Detailed product reviews
  - Complete online discussions and debates

• ‘Easy’ to implement extensions of these ideas

• What about interactivity with low bandwidth communication, such as text messaging?

Page 18

Question Answering Forum

From Krithi R., IIT Bombay. Thousands of posts, serving all of India.

Both a web interface and cell phone based SMS interaction.

Page 19

Leveraging Q&A Databases

• Consider Google 411 or Microsoft’s variant (demo)
• Here, we can create methods to identify if an answer already exists in the database
• Given the archive, we can hire people to translate questions and answers into local languages
• We can then use this corpus as an excellent test bed to develop automated translation techniques
• Speech recognition and synthesis techniques could be developed / tailored for these scenarios
• Resulting technology could then be applied to augment other information sources, e.g.
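One simple way to check whether an answer already exists in the database is to compare a new question with the archived questions by word overlap. The sketch below uses Jaccard similarity over word sets. This is an assumed baseline chosen for illustration, not the method used by any system named on the slides, and the archive entries, the placeholder answers, and the 0.5 threshold are all hypothetical.

```python
import re
from typing import List, Optional, Set, Tuple


def tokens(text: str) -> Set[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Word-set overlap: size of the intersection divided by size of the union."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def find_existing_answer(
    question: str,
    archive: List[Tuple[str, str]],
    threshold: float = 0.5,  # hypothetical cut-off; would need tuning on real data
) -> Optional[str]:
    """Return the answer of the most similar archived question, if similar enough."""
    best_answer, best_score = None, 0.0
    for archived_question, answer in archive:
        score = jaccard(question, archived_question)
        if score > best_score:
            best_answer, best_score = answer, score
    return best_answer if best_score >= threshold else None


# Made-up archive entries with placeholder answers.
archive = [
    ("How do I control insects on wheat?", "placeholder answer A"),
    ("When should sweet corn be planted?", "placeholder answer B"),
]
print(find_existing_answer("How can I control insects on my wheat?", archive))
print(find_existing_answer("What is the market price of rice?", archive))
```

Word overlap is language-specific and ignores synonyms, which is one reason the translated question-and-answer corpus described above would be useful as a test bed for better matching methods.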

Page 20

Community Generated Content

• Associate information with maps, by hand or with extraction
• We could easily fit the text of an agricultural Wikipedia / Agpedia / WikiGIS on a flash memory card
• Distribute information formatted for cell phones
• Use text to speech to give access to non-literate users

WikiMedia vs. Wikipedia

Page 21

But, What is a Television?

7” Digital ‘Picture Frame’: $60
Cell Phone: $40

[All you need to do to create a computer for the developing world is to connect a phone to a television] – C. Pal, C. Mundie and B. Gates :)

Page 22

What is Important

• Devices with no moving parts

• Low power consumption

• For LCDs – better text fidelity

We can think about

• Television programs as files (100s of MB), radio programs as files

• New, inexpensive low power chips that process these files

Page 23

Implications and Ideas

• Parts of the developing world may skip the era of CRTs and broadcast TVs.

• Interactive radio may be a concrete first step.

• 5 year horizon: multimedia, Wikimedia-like agricultural portals tailored to new device formats.

Page 24

Highlights: How Technology Can Enhance Information Value ‘Chains’

• Read/write data storage now very inexpensive
• Digitization and modern information extraction can help organize information on a massive scale
• Language technologies (translation, speech recognition, speech synthesis): almost mature
• Support the construction of high quality print, radio, and multimedia productions by giving communicators greater access to information
• Low cost ‘personal’ and shared devices can be used to interact with structured multimedia
• Immediate, medium and long term solutions.

the Ecosystem