1
MyLifeBits: Realizing the Memex Vision
Santa Clara University, 13 May 2004
Gordon Bell, Jim Gemmell & Roger Lueder
www.MyLifeBits.com
www.research.microsoft.com/~gbell
2
MyLifeBits collage
3
Outline … MyLifeBits
• Background…fulfilling the Memex vision
• Cyberizing everything
• File to database transition
• Use…beyond search
• Working with Media Center for home use
• Long-term agenda and outlook
• Archiving persons and things
4
Memex: As We May Think, Vannevar Bush, 1945
“A memex is a device in which an individual stores all his books, records, and communications, and which is mechanized so that it may be consulted with exceeding speed and flexibility”
Full-text search, text & audio annotations, and hyperlinks
5
Capturing what you see
6
I am data
7
The guinea pig
Gordon Bell is digitizing his life. Has now scanned virtually all:
• Books written (and read when possible)
• Personal documents (correspondence including memos and email, bills, legal documents, papers written, …)
• Photos
• Posters, paintings, photos of things (artifacts, …medals, plaques)
• Home movies and videos
• CD collection
• And, of course, all PC files
Now recording: phone, radio, TV (movies), web pages… conversations and meetings to come
Paperless throughout 2002. 12” scanned, 12’ discarded. Only 30 GB!!!
8
Capture and encoding
9
Quindi conference capture
10
I mean everything
11
Wearable & interactive jewellery LEDs flash according to sensor type triggered
12
Potentially useful trivia – but not normally photographed
13
GPS: tells where and when
14
Kentaro Toyama wwmx.org
15
gbell wag: 67 yr, 25Kday life
[Chart: lifetime storage (GB) on a log scale from 1 to 1,000,000, by item type: Msgs (100-5 KB), pages (100-50 KB), Tifs (5-100 KB), Books (0.1-1 MB), jpegs (10-400 KB), sound (40 Ks, 1 KBps), songs (0.1-100 MB), Videos (1-10 GB)]
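The "67 yr, 25Kday life" figure is plain arithmetic, sketched below in Python. Only the 67-year span comes from the slide; the 365.25-day year, the helper names, and the per-day capture rate in the usage lines are assumptions made for illustration and are not figures from the talk.

```python
DAYS_PER_YEAR = 365.25  # assumption: average year length, leap days included


def life_days(years: float) -> float:
    """Number of days in a span of `years`."""
    return years * DAYS_PER_YEAR


def lifetime_gb(items_per_day: float, kb_per_item: float, years: float) -> float:
    """Total storage in decimal GB (1 GB = 1e6 KB) for a steady daily capture rate."""
    total_kb = items_per_day * kb_per_item * life_days(years)
    return total_kb / 1e6


if __name__ == "__main__":
    # About 24.5K days, which the slide rounds up to 25K.
    print(life_days(67))
    # Hypothetical rate: 10 items per day at 100 KB each.
    print(lifetime_gb(10, 100, 67))
```

A per-type estimate would call `lifetime_gb` once for each row of the chart, using whatever item counts and sizes one believes apply.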
16
MyLifeBits organization: time and space
Timeline/Context (space)
Personal (some $s)
GB Co. (angel, etc.)
Professional: ACM, etc., …@Microsoft.com
New co’s.
Archival (time) Working
17
MyLifeBits: Some Lives(t)
• Personal
  – Parents, children, grandkids
  – CGB himself
  – GKB
  – Close friends
• GB $s
  – Personal incl. several legal structures
  – Properties: autos, real estate
  – Investments & contracts
• Past prof. companies/organiz’ns
  – DEC
  – Carnegie-Mellon U.
  – DEC, NSF, Encore, Ardent, Me Inc.
• CGB @ Microsoft
  – MLB
  – Clusters
  – Telepresence
  – WWW presence
• Computer History Museum
  – BOD member
  – Fund-raising
  – CyberMuseum
• Startups & boards
  – Bell-Mason Director
  – Diamond & Vanguard Brds.
18
Bell Lives timeline
1900 1910 1920 1930 1940 1950 1960 1970 1980 1990 2000 2010
C,L m d d
CGB... GB SR mB,L KF SB
Where KvMO B ABosP B WCa
6-year --GS-HS---MIT DEC---+++++.+++---++++
Education KV-----mit,F cmu
Work Bell Elec DECcmuDEC E,NSF MSFT
ComputerMuseum M B SiValley
Books BN SBN HiTechVent
Computers 4-6 11 VAX E A
19
Personal LifeLog Applications
[Diagram: applications placed on two axes, “Application controlled by: Others / Self” and “Application used by: Others / Self”]
Conservator, Baby Book, Companion, Caretaker, Babysitter, Advisor, Mentor, Tutor, Autobiography, Photo Album, Personal Assistant, Diary/Journal, Biography, Financial Manager, Medical Manager, Executor, Obituary, Assistant for Elderly, Personal Proxy, Parole Officer, Pers Flight Recorder, Meeting Prep, Captain’s Log, Trustee
20
MyLifeBits Software
[Architecture diagram: capture tools, interfaces, and legacy applications around the MyLifeBits store]
• MyLifeBits store
• database
• Voice annotation tool
• Text annotation tool
• Telephone capture tool
• TV capture tool
• TV EPG download tool
• Radio capture tool
• Radio EPG tool
• PocketPC transfer tool
• PocketRadio player
• Import files
• MyLifeBits Shell
• files
• Legacy applications
• Browser tool
• Internet
• IM capture
• MAPI interface
• Legacy email client
21
MyLifeBits is:
• Memex and more (audio and video)
• Universal store for all personal stuff
• Guiding principles for the system:
1. Full text search & collections (> than hierarchy)
2. Visualizations for search, display, insight
3. Annotations and links add value and are essential
   – Increase search ability and value of information.
   – So make many kinds and make them easy to create!
   – Stories are the ultimate annotation
4. Keep the links when you author: “transclusion”
22
MLB database: size and content?
• Database features are essential: Consistency, Indexing, Pivoting, Queries, Speed/scalability, Backup, replication.
• Folders & Files were the starting point >> database into sets aka “collections” that are identical to the folder structure
• Outlook (msgs, attachments, calendar, contacts)
• Web trails including voice message annotation
• Journal (Outlook), trails: every document use & transaction
• What about?
  – Money (transactions, payees, etc.)…is their lifelog/trail
  – Streets and trips to cross-index to all docs
  – Attributes for photos for retrieval? Location, time, settings
  – Presentations as a report or trail. Each slide an object!
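The move from folders and files to database “collections” can be sketched with Python's standard sqlite3 module. This is a minimal illustration, not the MyLifeBits schema: the table names (collection, item, member), the function names, and the sample paths are all hypothetical. It shows each folder becoming a collection that mirrors the folder structure, after which an item can also join further collections, which a strict hierarchy does not allow.

```python
import sqlite3
from pathlib import PurePosixPath

# Hypothetical schema: collections, items, and a many-to-many membership table.
SCHEMA = """
CREATE TABLE collection (id INTEGER PRIMARY KEY, path TEXT UNIQUE);
CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT, path TEXT UNIQUE);
CREATE TABLE member (collection_id INTEGER, item_id INTEGER,
                     PRIMARY KEY (collection_id, item_id));
"""


def open_store() -> sqlite3.Connection:
    """Create an empty in-memory store."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def add_to_collection(conn, collection, item_path):
    """Put an existing item into a collection, creating the collection if needed."""
    conn.execute("INSERT OR IGNORE INTO collection (path) VALUES (?)", (collection,))
    conn.execute(
        "INSERT OR IGNORE INTO member "
        "SELECT c.id, i.id FROM collection c, item i "
        "WHERE c.path = ? AND i.path = ?",
        (collection, item_path),
    )


def import_paths(conn, file_paths):
    """Create one collection per folder and file each item under its folder."""
    for raw in file_paths:
        p = PurePosixPath(raw)
        conn.execute(
            "INSERT OR IGNORE INTO item (name, path) VALUES (?, ?)", (p.name, raw)
        )
        add_to_collection(conn, str(p.parent), raw)
    conn.commit()


def items_in(conn, collection):
    """Names of the items in a collection, sorted alphabetically."""
    rows = conn.execute(
        "SELECT i.name FROM item i "
        "JOIN member m ON m.item_id = i.id "
        "JOIN collection c ON c.id = m.collection_id "
        "WHERE c.path = ? ORDER BY i.name",
        (collection,),
    )
    return [r[0] for r in rows]


if __name__ == "__main__":
    store = open_store()
    import_paths(store, ["photos/1999/beach.jpg", "photos/1999/hike.jpg", "docs/memo.txt"])
    add_to_collection(store, "favorites", "photos/1999/beach.jpg")
    print(items_in(store, "photos/1999"))
    print(items_in(store, "favorites"))
```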
23
Why bother? An existence proof.
The following exist in abundance:
• Shoeboxes full of photos
• Photo albums & framed photos
  – Creative Memories is a thriving business selling resources for creating high-end photo albums that are well laid out and highly annotated, using long-lasting materials.
• Home videos
• Bookshelves and filing cabinets
• Old bundles of letters
• Professional video/photo companies do capture at kids’ sports events and sell content like hotcakes
• Probably not accessed very often but TREASURED (what’s the one thing you would save in a fire?)
24
Why bother? ..more reasons
• To eliminate physical storage (paper, CDs…)
• It costs more (in time) to delete than the cost of the storage
• You may only want to retrieve one of many items in the future, but cannot predict which one (which is why you file many things now)
• For posterity and nostalgia
• For memory enhancement & faster search (search your LifeBits rather than the web … a single source to look for anything you have ever seen)
• Let content analysis and data mining discover trends and correlations in your life
Extensible XML schemas, Logical views, Programmatic relationships, Synchronization service, Information agents
[Diagram: blocks labelled system, people, user, application, and infrastructure, each with application specific data]
26
Annotation like this…
Voice Annotation
28
Pivot to look at all of MLB(t)
Call, contact, pivot by time to find web page
29
Find brig, image, and look for 80
30
Here are the photos
31
Timeline view tells a story
32
Interface to xls
33
Statistics of use
35
Value of media depends on annotations
“It’s just bits until it is annotated”
36
Getting the user to tell a story is the ultimate in media value
• A story is a “layout” in time and space
• Most valuable content (by selection, and by being well annotated)
• Stories must include links to any media they use (for future navigation/search – “transclusion”).
• Cf: MovieMaker; Creative Memories PhotoAlbums
Dapeng was an intern at BARC for the summer of 2000
We took him to lunch at our favorite Dim Sum place to say farewell
At table L-R: Dapeng, Gordon, Tom, Jim, Don, Vicky, Patrick, Jim
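The requirement that a story keep links to the media it uses can be sketched as a small data structure. The class names, the `media_ids` field, and the identifiers in the usage lines are hypothetical and are not taken from the MyLifeBits software; the point is only that a story stores references to media items rather than copies, so one can navigate from any media item back to the stories that use it.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Panel:
    """One step of a story: a caption plus references to the media it shows."""
    caption: str
    media_ids: List[str] = field(default_factory=list)


@dataclass
class Story:
    """A story is an ordered layout of panels; media are linked, not copied."""
    title: str
    panels: List[Panel] = field(default_factory=list)

    def media_used(self) -> Set[str]:
        """All media identifiers referenced anywhere in the story."""
        return {m for panel in self.panels for m in panel.media_ids}


def stories_using(stories: List[Story], media_id: str) -> List[str]:
    """Back-links: titles of the stories that reference a given media item."""
    return [s.title for s in stories if media_id in s.media_used()]


if __name__ == "__main__":
    lunch = Story(
        "Farewell lunch",
        [
            Panel("Dapeng was an intern at BARC for the summer of 2000", ["photo-001"]),
            Panel("We took him to lunch at our favorite Dim Sum place", ["photo-002"]),
        ],
    )
    print(stories_using([lunch], "photo-002"))
```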
37
Value of media depends on annotations
• Auto-annotate whenever possible e.g. GPS cameras
• Make manual annotation as easy as possible. XP photo capture, voice, photos with voice, etc
• Support gang annotation
• Make stories easy
[Chart: axis labelled “Annotations” with levels none, auto, auto-usage, user-basic, user-story]
“It’s just bits until it is annotated”
38
Future work: Visualizations
Don't give me a little card image and say, "That's all you've got, because that's what I thought you should want for your virtual shoebox." There have got to be multiple modalities and the designers have to be able to deal with that. … don't metaphor me in, don't give me only one way of looking at things.
-Andy van Dam, Hypertext '87 Keynote Address
Next Media
Web Scout
U. Maryland
IN-SPIRE
39
LifeLines (Plaisant et al.) www.cs.umd.edu/hcil/lifelines
University of Maryland
40
Rethinking collections & files
• Date collections (“summer 99”)
  – Much better as a query
• By Person (“Photos of Bill”)
  – Better as links of type “photo of” to person “Bill”
• By Event (“Trip to UCLA”)
  – Better as links to event in calendar
• Working set
  – Better as query that figures it out for me so I don’t need to maintain it
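A minimal sketch of the first two points, using Python's standard sqlite3 module: a date collection becomes a query over a date attribute, and a person collection becomes a query over typed links. The schema (item and link tables), the sample rows, and the choice of 1 June to 31 August as “summer” are assumptions for illustration only.

```python
import sqlite3

# Hypothetical schema: every object is an item; relationships are typed links.
SCHEMA = """
CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT, kind TEXT, taken TEXT);
CREATE TABLE link (src INTEGER, type TEXT, dst INTEGER);
"""


def build_demo() -> sqlite3.Connection:
    """In-memory store with one person and three photos (made-up sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO item VALUES (?, ?, ?, ?)",
        [
            (1, "Bill", "person", None),
            (2, "beach.jpg", "photo", "1999-07-04"),
            (3, "office.jpg", "photo", "1999-11-20"),
            (4, "picnic.jpg", "photo", "1999-08-15"),
        ],
    )
    conn.executemany(
        "INSERT INTO link VALUES (?, ?, ?)",
        [(2, "photo of", 1), (3, "photo of", 1)],
    )
    return conn


def summer_99(conn):
    """The 'summer 99' collection as a date-range query (range is an assumption)."""
    rows = conn.execute(
        "SELECT name FROM item WHERE kind = 'photo' "
        "AND taken BETWEEN '1999-06-01' AND '1999-08-31' ORDER BY taken"
    )
    return [r[0] for r in rows]


def photos_of(conn, person):
    """The 'Photos of <person>' collection as a query over 'photo of' links."""
    rows = conn.execute(
        "SELECT p.name FROM item p "
        "JOIN link l ON l.src = p.id AND l.type = 'photo of' "
        "JOIN item who ON who.id = l.dst "
        "WHERE who.kind = 'person' AND who.name = ? ORDER BY p.name",
        (person,),
    )
    return [r[0] for r in rows]


if __name__ == "__main__":
    db = build_demo()
    print(summer_99(db))
    print(photos_of(db, "Bill"))
```

Neither result set has to be maintained by hand: adding a photo with a date or a link changes what the queries return.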
41
Facets and people
• Time (& stage of life). Events…
• Location (lat/long vs home, vacation)
• Institution (relations including family, work, clubs,…)
• Role (student, professional, parent, owner, etc.)
• Content type
  – Audio, graphics, photo, video aka moving picture
  – Document type O(200) plus profession specific: ad, bill…will, cards (calling, credit, grade, greeting), certificate (birth…death), correspondence, diary, essay, forms, legal (6), instructions, lists, resume, reservation, scrapbook, transcript,
• Dissemination
  – Book, electronic, serial, unpublished,
• Special collections (e.g. geology, stamps, species, places)
42
Facet Lists
43
Certificate facets
44
“By region” and “by time” should be facets!
45
Telephone, Television, and Radio in the
Home of the Future
46
Evolution of media in the home
Yesterday:
• Analog storage and transmission on separate networks
• Physical space limitations
• Tedious management and manual search
Today:
• Digital storage (CDs, DVDs, PVRs, MPEG & WMA/V)
• Digital cable, internet radio, but phone is mostly analog
• Still limitations on what we can store
• Different stores for different stuff
Tomorrow:
• All digital
• Everything connected
• Unlimited storage
• Everything in a database
SQL
47
[Diagram: home media wiring. Devices: CD, VCR, Cassette, DVD, Plasma Panel, MediaCenter Computer, two Set top boxes, Receiver, Kbd, Mse, Wfr, Spkr, Camera, Mic. Links: Cable/Satellite, Ethernet, IR, SVHS-wide, 5.1 digital, stereo, comp., Video*. Three items are marked Legacy and one Redundant.]
*Video = composite or S-video
Cables/links
Speaker 5+1
Plasma 2 or 3
Cable/Enet 2
IR 8
Stereo 4
5.1 digital 2
Comp./S-video 3
Plasma panel 1
Power 10
Kbd/mse 2
Monitor II (opt.) 4
Camera 2
Total 42 – 46
Things 18+remotes
48
50
The Agenda for the Tbyte(s), Lifetime, PC: The killer app after office and mail.
1. Guarantee that data will live forever! “dear appy” problem
2. Cheap, easy, and data-rich (e.g. time, place) capture:
   – GPS and time everywhere
   – Paper capture has to be as easy as discarding (scanner/shredder)
   – Personal meeting capture...
   – E-book…e-magazines & journals need to have critical mass!
   – Telephony and audio capture with indexing
   – Media Center compatible for entertainment (photos, video, TV, radio)
3. Content analysis (critical for photo & video!)
4. Information control: privacy, security, expunge/deniability,…
5. Having to be schizophrenic or have a lobotomy when leaving a “life”
6. One dbase for everything (articles, books, conversations, ... financial transactions) …vs. long-term use of hierarchical files. Is dbase intuitive?
7. Annotations/meta-information add ever-increasing value
   – Easy annotation for aiding search and it becomes the content
8. The “killer apps”: Alzheimer, immortality, surrogate memory?
9. GUI’s to improve use (e.g. time to learn, use, retention)
51
The “dear appy” problem
Dear Appy,
How committed are you?
Please come back to me,
Lost and forgotten data
Who’s responsible?
• media
• platform, file, and databases
• evolving standards and formats
• evolving and/or disappearing apps
52
Problems: “Amnesia” control & deleting corporate “life” bits
• Full sharing of bits that are mine
  – I created them, OK to copy and distribute
• DRM: purchased for my own use
  – “OK to look at, but I only own half the bits”
• Controlling forgetfulness
  – Private, do not “demo”
  – Expunge forever... “this never happened”
  – The bits “belong” to a corporation or org.
53
The Content Analysis Problem
1. “Cliplets”: Automatic segmentation of a pile of documents and video into individual documents and scenes.
2. Item typing: Would like a minimal Dublin Core for each item: date, creator, title, source, abstract, and type
3. “Type” classification: articles, letters, memos, etc.
4. Ontology creation for collections
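The “minimal Dublin Core” record in point 2 can be sketched as a small Python dataclass holding the six fields the slide lists. The class name, the `item_type` spelling (chosen to avoid shadowing Python's built-in `type`), and the `missing_fields` helper are assumptions for illustration; the sample values come from this talk's title slide, except the type, which is a guess.

```python
from dataclasses import dataclass, asdict
from typing import List, Optional


@dataclass
class ItemRecord:
    """The six fields named on the slide; all optional until content analysis fills them."""
    date: Optional[str] = None
    creator: Optional[str] = None
    title: Optional[str] = None
    source: Optional[str] = None
    abstract: Optional[str] = None
    item_type: Optional[str] = None  # the slide's "type"


def missing_fields(record: ItemRecord) -> List[str]:
    """Names of the fields that are still empty, in declaration order."""
    return [name for name, value in asdict(record).items() if not value]


if __name__ == "__main__":
    talk = ItemRecord(
        date="2004-05-13",
        creator="Gordon Bell",
        title="MyLifeBits: Realizing the Memex Vision",
        item_type="presentation",  # illustrative value, not from the source
    )
    print(missing_fields(talk))
```

A list of missing fields like this is one way to decide which items still need automatic or manual typing.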
54
The End
55
Archiving persons and things…
• www.oac.cdlib.org for O(1K) corporations, people, places, things.
  – List of finders, usually -> paper boxes!
  – E.g. Apple collection at Stanford points to 600’ or say $1K/ft.
• www.AlbertEinstein.org Einstein’s papers, etc.
• diva.library.cmu.edu/Newell/ for Allen Newell
• profiles.nlm.nih.gov/ Nobel Prize winners, Lederberg
• www.ComputerHistory.org computing artifacts
• www.MyLifeBits.com project to capture entire life
56
List of finding aids
57
Apple at Stanford
58
www.alberteinstein.info
59
Allen Newell page
60
Lederberg
61
Computer History Museum
• 1401 Shoreline, Mountain View
62
Archiving computing artifacts
• Charles Babbage Institute …Smithsonian is similar
  – 135 collections 8K cu.ft. (20 M pages; 2 TB)
  – 160 oral histories (30MB/hr =6000 MB)
  – 150 K photos (@1MB, 150 GB)
• Computer History Museum
  – 6 K physical objects: world’s best artifact collection
  – 10 K photos
  – 2 K videos (<1 TB); including recent DV taped interviews
  – 12 M pages books, manuals, brochures, papers, (1.2 TB)
  – ?? Of executable source & object codes
  – 200 volunteers & many more world-wide
Amateurs versus professionals.
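The storage figures above can be cross-checked with a few lines of Python. The helper names and the use of decimal units (1 TB = 1e9 KB, 1 GB = 1000 MB) are assumptions. Under them, both page counts work out to about 100 KB per page, and the oral-history total of 6000 MB at 30 MB/hr implies about 200 hours of recording, or 1.25 hours per history on average if both slide figures hold.

```python
def kb_per_page(pages: float, terabytes: float) -> float:
    """Average page size in KB, assuming decimal units (1 TB = 1e9 KB)."""
    return terabytes * 1e9 / pages


def implied_hours(total_mb: float, mb_per_hour: float) -> float:
    """Hours of recording implied by a total size and a data rate."""
    return total_mb / mb_per_hour


def photos_gb(count: float, mb_each: float) -> float:
    """Total photo storage in GB, assuming 1 GB = 1000 MB."""
    return count * mb_each / 1000


if __name__ == "__main__":
    print(kb_per_page(20e6, 2))         # Charles Babbage Institute pages
    print(kb_per_page(12e6, 1.2))       # Computer History Museum pages
    print(implied_hours(6000, 30))      # total oral-history hours
    print(implied_hours(6000, 30) / 160)  # average hours per oral history
    print(photos_gb(150_000, 1))        # 150 K photos at 1 MB each
```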
63
Computer History Museum
Artifact Collecting… the world is bits
• Artifact (“the machine”)
  – Dormant or operating
  – Hardware or software
• Project, people, plan
  – Timeline of project
  – Plan, schedule
  – Specification, manuals
  – Design
  – Organization
  – Communication
  – Articles, books
  – Interviews, talks, etc.
• Business aspects
  – Plan, sales, marketing
  – Ads, brochures, etc.
  – Competitors
• Use
  – User experience
  – Video about its use
• Accessibility
  – Raw bits, finding aid
  – Interpreted story
  – Exhibit
64
ChM Software Acquisition