Research Principles Revealed
Jennifer Widom
Stanford University
But First, Some Thanks
Four Extra-Special People
Superb Students
Terrific Collaborators
Extra-Special #1
Laura Haas
• Hired a PL/logic person with minimal DB experience
• The Perfect Manager
– Mentored instead of managed
– Ensured I could devote nearly all of my time to research
– Sported a great button
Extra-Special #2
Stefano Ceri
• Incredible run of summer collaborations (IBM and Stanford)
• Jennifer (Details) + Stefano (Intuition) → Success
Extra-Special #3 and #4
Hector Garcia-Molina and Jeff Ullman
• Colleagues, mentors, book co-authors
• Neighbors, baby-sitters, sailing crew, kids sports photographers, …
{ Hector, Jeff, Jennifer }
• Research collaborations in all 2³ subsets
Superb Ph.D. Students
Terrific Collaborators*
Serge Abiteboul
Brian Babcock
Elena Baralis
Omar Benjelloun
Sudarshan Chawathe
Bobbie Cochrane
Shel Finkelstein
Alon Halevy
Rajeev Motwani
Anand Rajaraman
Shuky Sagiv
Janet Wiener
* Significant # co-authored papers in DBLP
Now to the “Technical” Part …
Research Principles Revealed
1. Topic Selection
2. The Research
3. Dissemination
Disclaimer: These principles work for me. Your mileage may vary!
Major Research Areas
• Active Databases
• Data Warehousing
• Semistructured Data (“Lore”)
• Data Streams
• Uncertainty and Lineage (“Trio”)
Topics within these areas: Constraints, Triggers, Incremental View Maintenance, Lineage
Finding Research Areas
I’m not a visionary (in fact, I’m “anti-visionary”)
• Never know what my next area will be
• Some combination of “gut feeling” and luck
Finding Research Areas
• Active Databases
• Data Warehousing
• Semistructured Data
• Data Streams
• Uncertainty and Lineage
• Data Integration
Finding Research Areas
Uncertainty and Lineage
Finding Research Topics
One recipe for a successful database research project
• Pick a simple but fundamental assumption underlying traditional database systems
Drop it
• Must reconsider all aspects of data management and query processing
– Many Ph.D. theses
– Prototype from scratch
Finding Research Topics
Example “simple but fundamental assumptions”
• Schema declared in advance → Semistructured data
• Persistent data sets → Data streams
• Tuples contain values → Uncertain data
Reconsidering “all aspects”
• Data model
• Query language
• Storage and indexing structures
• Query processing and optimization
• Concurrency control, recovery
• Application and user interfaces
The Research Itself
Critical triple for any new kind of database system
Data Model → Query Language → System
• Do all of them
• In this order
• Cleanly and carefully (a research luxury)
Solid foundations, then implementation
Nailing Down a New Data Model
Cleanly and carefully
Example: “A data stream is an unbounded sequence of [tuple timestamp] pairs”
Temperature Sensor 1: [(72) 2:05] [(75) 2:20] [(74) 2:21] [(74) 2:24] [(81) 2:45] …
Temperature Sensor 2: [(73) 2:03] [(76) 2:20] [(73) 2:22] [(75) 2:22] [(79) 2:40] …
• Duplicate timestamps in streams? If yes, is order relevant?
• Are timestamps coordinated across streams?
Sample Query (continuous): “Average discrepancy between sensors”
Result depends heavily on model
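A minimal Python sketch of why the answer to “Average discrepancy between sensors” depends on the model. Both interpretations below are assumptions made for illustration, not definitions from the talk: model A assumes timestamps are coordinated across streams and compares only readings with equal timestamps; model B compares the latest reading of each sensor after every timestamp, and treats arrival order as relevant when a stream has duplicate timestamps. The function names are hypothetical.

```python
from itertools import groupby

# Readings from the slide as (value, "h:mm") pairs, in arrival order.
SENSOR_1 = [(72, "2:05"), (75, "2:20"), (74, "2:21"), (74, "2:24"), (81, "2:45")]
SENSOR_2 = [(73, "2:03"), (76, "2:20"), (73, "2:22"), (75, "2:22"), (79, "2:40")]


def minutes(hhmm):
    """Convert an "h:mm" timestamp to minutes so timestamps sort numerically."""
    hours, mins = hhmm.split(":")
    return int(hours) * 60 + int(mins)


def avg_discrepancy_matched(s1, s2):
    """Model A (assumed): compare only readings whose timestamps are equal."""
    diffs = [abs(v1 - v2) for v1, t1 in s1 for v2, t2 in s2 if t1 == t2]
    return sum(diffs) / len(diffs) if diffs else None


def avg_discrepancy_latest(s1, s2):
    """Model B (assumed): after each timestamp, compare the latest readings.

    Tuples sharing a timestamp are applied in arrival order, so for
    duplicate timestamps the last tuple wins (order is relevant).
    """
    events = [(minutes(t), 1, i, v) for i, (v, t) in enumerate(s1)]
    events += [(minutes(t), 2, i, v) for i, (v, t) in enumerate(s2)]
    events.sort()
    latest, diffs = {}, []
    for _, batch in groupby(events, key=lambda e: e[0]):
        for _, sensor, _, value in batch:
            latest[sensor] = value
        if len(latest) == 2:  # both sensors have reported at least once
            diffs.append(abs(latest[1] - latest[2]))
    return sum(diffs) / len(diffs) if diffs else None


print(avg_discrepancy_matched(SENSOR_1, SENSOR_2))  # 1.0 (only 2:20 matches)
print(avg_discrepancy_latest(SENSOR_1, SENSOR_2))   # 13/7, about 1.86
```

The same two input streams give 1.0 under one model and about 1.86 under the other, which is the kind of ambiguity the questions above are meant to settle before any query language is designed.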
Data Model for Trio Project
Possible models
• Relative expressiveness
• Closure properties
• Only “complete” model
• Only understandable models
In the end, lineage saved the day
The Research Triple
Data Model → Query Language → System
Query Language
Query Language Design
Data Model → Query Language → System
Notoriously difficult to publish
But potential for huge long-term impact
Semantics can be surprisingly tricky
• Cleanly and carefully
Solid foundations, then implementation
The IBM-Almaden Years
Developing an active rule (trigger) system
Transition tables, Conflicts, Confluence, …
“Write Code!”
“We finished our rule system ages ago”
“Yeah, but what does it do?”
“Umm … I’ll need to run it to find out”
Disclaimer: These principles work for me. Your mileage may vary.
Tricky Semantics Example #1
Semistructured data (warm-up)
Query: SELECT Student WHERE Advisor=‘Widom’
<Student> <ID> 123 </ID> <Name> Susan </Name> <Major> CS </Major> </Student>
<Student> ● ● ● </Student>
• Error?
• Empty result?
• Warning?
Lore
• Empty result
• Warning
Query: SELECT Student WHERE Advisor=‘Widom’
<Student> <ID> 123 </ID> <Advisor> Garcia </Advisor> <Advisor> Widom </Advisor> </Student>
<Student> ● ● ● </Student>
Lore: Implicit ∃
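The two Lore behaviors can be sketched in a few lines of Python. This is an illustration only, not Lore's implementation: objects are modeled as dictionaries from label to a list of values, `select_where` is a hypothetical helper, and the exact condition under which a warning is raised (here: the label occurs nowhere in the data) is an assumption.

```python
import warnings


def select_where(objects, label, value):
    """Return objects having SOME sub-object `label` equal to `value`.

    The comparison is implicitly existential: a student with two
    advisors matches if either one equals `value`. A missing label is
    not an error; the object simply does not match.
    """
    if not any(label in obj for obj in objects):
        # Assumed policy: warn when the label never occurs in the data.
        warnings.warn(f"label {label!r} does not occur in the data")
    return [obj for obj in objects
            if any(v == value for v in obj.get(label, []))]


# First data set from the slide: no Advisor sub-object at all.
no_advisor = [{"ID": [123], "Name": ["Susan"], "Major": ["CS"]}]
# Second data set: one student with two Advisor sub-objects.
two_advisors = [{"ID": [123], "Advisor": ["Garcia", "Widom"]}]

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    print(select_where(no_advisor, "Advisor", "Widom"))  # []
    print(len(caught))                                   # 1 (the warning)

print(select_where(two_advisors, "Advisor", "Widom") == two_advisors)  # True
```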
Tricky Semantics Example #2
Trigger 1: WHEN X makes sale > 500 THEN increase X’s salary by 1000
Trigger 2: WHEN average salary increases > 10% THEN increase everyone’s salary by 500
Inserts: Sale(Mary,600) Sale(Mary,800) Sale(Mary,550)
• How many increases for Mary?
• If each causes average > 10%, how many global raises?
• What if global raise causes average > 10%?
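The first question already has more than one defensible answer. The sketch below, in Python, contrasts two possible firing granularities for Trigger 1 only; both policies, the function names, and the starting salaries are assumptions for illustration, and Trigger 2 with its cascading questions is deliberately left out.

```python
def fire_per_row(salaries, inserted_sales, threshold=500, raise_amount=1000):
    """Row-level policy: the trigger fires once per qualifying inserted tuple."""
    result = dict(salaries)
    for seller, amount in inserted_sales:
        if amount > threshold:
            result[seller] += raise_amount
    return result


def fire_per_statement(salaries, inserted_sales, threshold=500, raise_amount=1000):
    """Set-level policy: the trigger fires once for the whole set of inserts.

    All inserted tuples are visible together (in the spirit of a
    transition table), and each qualifying seller is raised once.
    """
    result = dict(salaries)
    qualifying = {seller for seller, amount in inserted_sales if amount > threshold}
    for seller in qualifying:
        result[seller] += raise_amount
    return result


salaries = {"Mary": 5000, "Bob": 5000}  # hypothetical starting salaries
sales = [("Mary", 600), ("Mary", 800), ("Mary", 550)]  # inserts from the slide

print(fire_per_row(salaries, sales)["Mary"])        # 8000: three raises
print(fire_per_statement(salaries, sales)["Mary"])  # 6000: one raise
```

Whether Trigger 2 then fires zero, one, or three times depends on which policy is chosen and on when the average is re-evaluated, which is why the semantics has to be fixed before the code is written.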
Tricky Semantics Example #3
Temperature Sensor: [(72) 2:00] [(74) 2:00] [(76) 2:00] [(60) 8:00] [(58) 8:00] [(56) 8:00]
Query (continuous): Average of most recent three readings
System A: 74, 58
System B: 74, 70, 64.7, 58
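One way to reproduce both outputs is sketched below in Python. The slide does not say how System A and System B work internally; the two policies here (emit once per distinct timestamp versus emit after every tuple, both only once three readings are available) are assumptions that happen to match the numbers shown.

```python
from collections import deque
from itertools import groupby

# Readings from the slide as (value, timestamp) pairs, in arrival order.
READINGS = [(72, "2:00"), (74, "2:00"), (76, "2:00"),
            (60, "8:00"), (58, "8:00"), (56, "8:00")]


def averages_per_timestamp(readings, size=3):
    """Assumed policy A: emit once per timestamp, after all its tuples arrive."""
    window = deque(maxlen=size)  # keeps only the most recent `size` values
    out = []
    for _, batch in groupby(readings, key=lambda r: r[1]):
        for value, _ in batch:
            window.append(value)
        if len(window) == size:
            out.append(round(sum(window) / size, 1))
    return out


def averages_per_tuple(readings, size=3):
    """Assumed policy B: emit after every tuple once the window is full."""
    window = deque(maxlen=size)
    out = []
    for value, _ in readings:
        window.append(value)
        if len(window) == size:
            out.append(round(sum(window) / size, 1))
    return out


print(averages_per_timestamp(READINGS))  # [74.0, 58.0]
print(averages_per_tuple(READINGS))      # [74.0, 70.0, 64.7, 58.0]
```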
The “It’s Just SQL” Trap
Tables: Sigmod(year,loc,…) Climate(loc,temp,…)
Query: Temperature at SIGMOD 2010
Sigmod (year, loc)
2010 London ∥ New York
Climate (loc, temp)
London [ 55 – 68 ]
New York [ 64 – 79 ]
SELECT C.temp
FROM Sigmod S, Climate C
WHERE S.loc = C.loc AND S.year = 2010
The “It’s Just SQL” Trap
• Syntax is one thing (actually it’s nothing)
• Semantics is another, as we’ve seen
― Semistructured
― Continuous
― Uncertain
― <Insert future new model here>
Taming the Semantic Trickiness
Reuse existing (relational) semantics whenever possible
Uncertain data: semantics of query Q
• D → possible instances → D1, D2, …, Dn
• Q on each instance → Q(D1), Q(D2), …, Q(Dn) (30 years of refinement)
• Result is a representation of instances Q(D1), Q(D2), …, Q(Dn)
• D → Result directly: (implementation)
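The definition can be made concrete with the SIGMOD 2010 example, as a minimal Python sketch. It spells out the semantics only: enumerating instances is exponential, and a real system would compute the result representation directly. The encoding is an assumption: each uncertain tuple is a list of alternatives of which exactly one holds, and a temperature range such as [55 – 68] is kept as one opaque value rather than expanded. All names are hypothetical.

```python
from itertools import product

# One uncertain tuple per inner list; its entries are mutually exclusive alternatives.
SIGMOD = [[(2010, "London"), (2010, "New York")]]
CLIMATE = [[("London", (55, 68))], [("New York", (64, 79))]]


def possible_instances(uncertain_relation):
    """Enumerate D1, ..., Dn: choose one alternative for every uncertain tuple."""
    return [list(choice) for choice in product(*uncertain_relation)]


def temperature_at(sigmod, climate, year):
    """Ordinary relational query Q on one certain instance: join on loc."""
    return [temp
            for (y, loc) in sigmod
            for (climate_loc, temp) in climate
            if y == year and loc == climate_loc]


def query_all_instances(year=2010):
    """Q(D1), ..., Q(Dn): run the unchanged relational query on every instance."""
    return [temperature_at(s, c, year)
            for s in possible_instances(SIGMOD)
            for c in possible_instances(CLIMATE)]


print(query_all_instances())  # [[(55, 68)], [(64, 79)]]
```

The answer is itself uncertain: either the London range or the New York range, one per possible instance. The relational query is reused untouched; only the step from D to its instances and from the answers back to a result representation is new.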
Taming the Semantic Trickiness
Reuse existing (relational) semantics whenever possible
Semantics of stream queries
• Streams → Relations: Window
• Relations → Relations: Relational (30 years of refinement)
• Relations → Streams: Istream / Dstream
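A small Python sketch of the three kinds of operators, assuming integer timestamps and bag (multiset) semantics via `collections.Counter`. The window definition, the selection, and the sample data are illustrative assumptions; Istream and Dstream are modeled as the difference between the relation at consecutive instants, which is how they are usually described.

```python
from collections import Counter

# (value, integer timestamp) pairs; hypothetical sample data.
STREAM = [(72, 1), (75, 2), (74, 3), (81, 4)]


def window(stream, now, width):
    """Stream -> relation: bag of values with timestamp in (now - width, now]."""
    return Counter(value for value, ts in stream if now - width < ts <= now)


def hot_readings(relation, threshold=74):
    """Relation -> relation: an ordinary relational selection."""
    return Counter({v: n for v, n in relation.items() if v >= threshold})


def istream(previous, current):
    """Relation -> stream: tuples present now but not at the previous instant."""
    return current - previous


def dstream(previous, current):
    """Relation -> stream: tuples present at the previous instant but not now."""
    return previous - current


at_3 = hot_readings(window(STREAM, now=3, width=2))  # values 75 and 74
at_4 = hot_readings(window(STREAM, now=4, width=2))  # values 74 and 81

print(istream(at_3, at_4))  # Counter({81: 1})
print(dstream(at_3, at_4))  # Counter({75: 1})
```

Only `window`, `istream`, and `dstream` are stream-specific; everything between them is plain relational processing, which is the point of reusing existing semantics.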
Taming the Semantic Trickiness
Reuse existing (relational) semantics whenever possible
• Active databases: “transition tables”
• Lore: semantics based on OQL (3 years of refinement)
The Research Triple
Data Model → Query Language → System
System
Impact
“Write Code!”
Truth in Advertising
System → Data Model → Query Language → System
• As research evolves, always revisit all three
• Cleanly and carefully!
Disseminating Research Results
If it’s important, don’t wait
• No place for secrecy (or laziness) in research
• Every place for being first with new idea or result
• Post on Web, inflict on friends
• SIGMOD/VLDB conferences are not the only place for important work
Send to workshops, SIGMOD Record, …
• Make software available and easy to use
Decent interfaces, run-able over web
Summary: Five Points
1 Don’t dismiss the Intuition types (intuition ≠ visionary)
And don’t forget the Details
2 Data Model + Query Language + System
Solid foundations, then implementation
3 QL semantics: surprisingly tricky
Reuse existing (relational) semantics whenever possible
Summary: Five Points
4 Don’t be secretive or lazy
Disseminate ideas, papers, and software
5 If all else fails, try stirring in the key ingredient: Incremental View Maintenance
Thank You
Serge Abiteboul
Brian Babcock
Elena Baralis
Omar Benjelloun
Sudarshan Chawathe
Bobbie Cochrane
Shel Finkelstein
Alon Halevy
Rajeev Motwani
Anand Rajaraman
Shuky Sagiv
Janet Wiener