New Direction for TPC
by
Michael Stonebraker
Outline
1985
1985-88
PAFS
TPC-H
The future
1985
Jim Gray writes debit-credit benchmark
And gets his friends to be co-authors
Commercial systems do about 25 TPS
Obviously inadequate
Jim Gray starts HPTS
Goal is 1000 TPS (x40)
1985-88
Lots of ideas generated on improving OLTP performance
Facilitated by HPTS
Lots of apples-to-oranges debit-credit benchmarks
With conventional vendor marketing spin
But performance improves by an order of magnitude
Obvious Need for
A level playing field for debit-credit
A non-vendor organization to carry debit-credit forward
Enter TPC and TPC-A
Characteristics of Debit-Credit
Pressing need
for better OLTP performance
Application focused
Cash a check
Simple
5 commands, 5 pages of specification
Result was vendor focus and much better OLTP systems
Meta-Characteristics
Find a Pressing need
Find a simple Application
Focus the vendor community
To provide better Systems
PAFS!
TPC-H (PAFS)
Application/schema doesn’t correspond to an obvious business problem
Schema seems unnatural
See Pat O’Neil’s talk
TPC-H (PAFS)
Way too many queries (22)
And queries seem politically gerrymandered
Can’t use materialized views
TPC-H (PAFS)
No load component in TPC-H
Users want the ability to perform incremental/trickle load
TPC-H (PAFS)
Out-of-box experience awful for most systems
Data base design way too hard – too many knobs
And automatic tools don’t work very well
RDBMS considered too hard to use by many
TPC-H (PAFS)
Scalability over a range of sizes is a big issue
Ability to add resources on the fly is a big issue
TPC-H (PAFS)
Nobody recovers from the data base log
No replication in TPC-H
TPC-H (PAFS)
Major warehouse vendors (e.g. Teradata, Netezza) ignore TPC-H
Analysts (Forrester, Gartner) say TPC-H is irrelevant
TPC-H (PAFS)
Current leaders run on silly hardware configurations
E.g. 1 Terabyte of disk for a 30 Gbyte configuration (32 X)
TPC-H
A failure by PAFS standards
At the very best is “long in the tooth”
Follow-on effort (TPC-DS) is worse by PAFS standards
And TPC progress is at the speed of molasses
TPC-H
A failure by PAFS standards
At the very best is “long in the tooth”
Follow-on effort (TPC-DS) is worse by PAFS standards
And TPC progress is at the speed of very slow molasses
E.g. little stomach to fix these issues
TPC-C
Essentially same comments apply
Summary of TPC
Is very slow moving
Seems vendor dominated
Political and not user focused
Not focused on PAFS
So What to Do?
Go back to your roots
E.g. PAFS
In your traditional market
In new markets
Example – One Among Many
Science applications (e.g. Chemistry, Earth Sciences, Remote Sensing, …)
Universally hate current RDBMS
Nearest neighbor queries, time series queries
Snow Cover in the Sierras
Protein Structure
Chromatin Structure
DNA
Human Genome Matching
http://genome.ucsc.edu/ENCODE/encode.hg18.html
Why?
Wrong data model
Remote sensing guys want arrays
Which are horribly inefficient and usually very unnatural to simulate on top of tables
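A small sketch of what "simulating arrays on top of tables" looks like in practice. The `cell(r, c, v)` schema and the 3x3 window query are illustrative assumptions, not something the slides specify, and SQLite is used only as a stand-in for a relational engine.

```python
import sqlite3


def window_sum_array(grid, r, c):
    """Sum the 3x3 neighbourhood around (r, c) by direct indexing."""
    return sum(grid[i][j]
               for i in range(r - 1, r + 2)
               for j in range(c - 1, c + 2))


def load_as_table(conn, grid):
    """Store the array as one row per cell, with explicit coordinates."""
    # Hypothetical schema: every value now carries two index columns.
    conn.execute(
        "CREATE TABLE cell (r INTEGER, c INTEGER, v REAL, PRIMARY KEY (r, c))")
    conn.executemany(
        "INSERT INTO cell VALUES (?, ?, ?)",
        [(i, j, v) for i, row in enumerate(grid) for j, v in enumerate(row)])


def window_sum_table(conn, r, c):
    """Same 3x3 sum, expressed as a range predicate over coordinate columns."""
    (total,) = conn.execute(
        "SELECT SUM(v) FROM cell "
        "WHERE r BETWEEN ? AND ? AND c BETWEEN ? AND ?",
        (r - 1, r + 1, c - 1, c + 1)).fetchone()
    return total


# A 4x4 grid holding the values 0.0 .. 15.0 in row-major order.
grid = [[float(i * 4 + j) for j in range(4)] for i in range(4)]

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    load_as_table(conn, grid)
    print(window_sum_array(grid, 1, 1), window_sum_table(conn, 1, 1))
```

Both functions return the same number, but the table version stores two coordinates alongside every value and locates neighbours through an index lookup rather than by position, which is the overhead and awkwardness the slide is pointing at.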
Why?
Wrong operations
Consider two satellite imagery data sets, one with 50 m cells in lat-long and one with 75 m cells in Mercator
Need to regrid one to the other as a DBMS operation
Regrid needs to be built in
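As a rough illustration of what a regrid operation does, the sketch below resamples a grid from one cell size to another using nearest-neighbour lookup. It is deliberately simplified: both grids are assumed to share an origin and axes, so the lat-long to Mercator reprojection mentioned above is ignored, and the function name `regrid_nearest` is hypothetical.

```python
def regrid_nearest(src, src_cell, dst_cell):
    """Resample a 2D grid to a new cell size by nearest-neighbour lookup.

    Assumes both grids share the same origin and axes; any change of
    projection is out of scope for this sketch.
    """
    rows, cols = len(src), len(src[0])
    height, width = rows * src_cell, cols * src_cell  # extent in metres
    out_rows = int(height // dst_cell)
    out_cols = int(width // dst_cell)

    out = []
    for i in range(out_rows):
        y = (i + 0.5) * dst_cell                # centre of the target cell
        si = min(int(y // src_cell), rows - 1)  # source row containing it
        row = []
        for j in range(out_cols):
            x = (j + 0.5) * dst_cell
            sj = min(int(x // src_cell), cols - 1)
            row.append(src[si][sj])
        out.append(row)
    return out


if __name__ == "__main__":
    # Nine 50 m cells covering 150 m x 150 m, resampled to 75 m cells.
    coarse = regrid_nearest([[1, 2, 3], [4, 5, 6], [7, 8, 9]], 50, 75)
    print(coarse)
```

A production regrid would also handle reprojection and offer interpolation choices beyond nearest neighbour; the slide's point is that users should not have to write this loop outside the DBMS.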
Why?
Wrong features
Need provenance (i.e. ability to tell how a data element was derived)
Requires a log of all operations and some provenance-oriented operations
And repeatability (i.e. rederive the scientific calculation if necessary)
Requires no-overwrite storage and time-travel
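The three ingredients named above (an operation log, no-overwrite storage, and time-travel reads) can be sketched in a few lines. The class and method names here are hypothetical and the logical clock is a simple counter; this is a toy model of the idea, not a description of any particular system.

```python
class VersionedStore:
    """Append-only store: writes never overwrite, so old states stay readable."""

    def __init__(self):
        self._versions = {}  # key -> list of (time, value, op_id), oldest first
        self._log = []       # one entry per write; op_id is the list index
        self._clock = 0      # logical time, incremented on every write

    def put(self, key, value, operation="load", inputs=()):
        """Record a new version of `key` and log the operation that made it."""
        self._clock += 1
        op_id = len(self._log)
        self._log.append({"op": operation, "inputs": tuple(inputs),
                          "output": key, "time": self._clock})
        self._versions.setdefault(key, []).append((self._clock, value, op_id))
        return self._clock

    def _lookup(self, key, as_of):
        # Walk versions newest-first and return the first one visible at as_of.
        for time, value, op_id in reversed(self._versions.get(key, [])):
            if as_of is None or time <= as_of:
                return value, op_id
        raise KeyError(key)

    def get(self, key, as_of=None):
        """Time-travel read: the value of `key` as of logical time `as_of`."""
        return self._lookup(key, as_of)[0]

    def provenance(self, key, as_of=None):
        """Log entry describing how the visible version of `key` was derived."""
        return self._log[self._lookup(key, as_of)[1]]


if __name__ == "__main__":
    store = VersionedStore()
    t1 = store.put("raw", 10)
    store.put("calibrated", 20, operation="calibrate", inputs=["raw"])
    store.put("raw", 11)  # a new version; the old one is kept
    print(store.get("raw"), store.get("raw", as_of=t1))
    print(store.provenance("calibrated"))
```

Because every write is logged with its inputs and timestamp, a derived value can be traced back to its sources and recomputed against the input versions that were visible at the time.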
Net Result
Science does not use RDBMS (for anything other than metadata)
Crying need not being met by current systems!
A PAFS effort by TPC could change all this!!
Same Story
In RDF
In Web 2.0 companies
In real-time data manipulation
In Map-Reduce style computing
So What is the Best Route Forward?
Best benchmarks are written by one person (e.g. debit-credit)
Typically in small numbers of days
And reviewed by the community in small numbers of weeks
And adopted in months (not years or decades)
So What is the Best Route Forward?
There are lots of academic benchmarks that fit this model and have gained traction, e.g.
Linear road (streaming data)
MR benchmark (MR vs DBMS)
Madden/Abadi RDF benchmark
So What is the Best Route Forward?
Troll the research world for such things
So What is the Best Route Forward?
Involve research community in your activities
But nobody will do so with your current heavyweight process
You will have to violently streamline
So What is the Best Route Forward?
Switch from a vendor-focus to a user-focus
Only way to get PA in PAFS
I.e. It is Time for TPC to Reinvent Itself
Mantra has to be PAFS
Streamline process
Involve research community
New charter!
Everybody should do this once a decade – you are a decade late
Otherwise
TPC will become a legacy world only relevant in some traditional business data processing areas
I.e. you will walk into the sunset of irrelevance