Telegraph: A Universal System for Information
Telegraph History & Plans
• Initial Vision
  – Carey, Hellerstein, Stonebraker
  – “Regres”, “B-1”
• Sweat, ideas and further vision
  – 4 of my grads committed
  – Brewer + 2 grads committed
  – Franklin will play
  – obvious tie-ins with other projects
Telegraph Architecture
[Architecture diagram. Layers, in slide order:]
• Query/Browse/Mine
• Global Agoric Federation
• Continuously Reoptimizing Query Processor; Adaptive Data Placement
• Storage Manager (FS, DB, Web)
[Related projects, in slide order:]
• Ninja, GiST, IStore
• River, Ninja, Aetherstore, Control, STIX
• Mariposa, Millennium, Control
• Control, DigLib
& synergies!
Storage Manager
• Historic chance to start over!
  – new hardware realities
    • variable-length segments, not blocks
    • big main memories
    • extra CPUs at the devices (IStore)
  – revisit and clean up infrastructure for transactions
    • clean API supporting both log-based & version-based schemes; version-based runs today!
    • big SW Eng. challenge
  – unify DB/FS/Web server!
    • Clients: Ninja’s persistent hash table, query processing, web server, Linux (NT?) filesystem.
  – (Mohan Lakhamraju, Rob von Behren, Steve Gribble)
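One way to picture a "clean API supporting both log-based & version-based schemes" is a single transactional interface with two interchangeable back ends. The sketch below is illustrative only: the names `TransactionalStore`, `LogBasedStore`, `VersionBasedStore`, and `transfer` are hypothetical, not Telegraph's actual interface, and the code is in-memory and single-user, with no durability or concurrency control.

```python
from abc import ABC, abstractmethod

_MISSING = object()  # marks "key did not exist" in the undo log


class TransactionalStore(ABC):
    """Hypothetical common interface; the slides do not give the real API."""

    @abstractmethod
    def begin(self): ...
    @abstractmethod
    def read(self, txn, key): ...
    @abstractmethod
    def write(self, txn, key, value): ...
    @abstractmethod
    def commit(self, txn): ...
    @abstractmethod
    def abort(self, txn): ...


class LogBasedStore(TransactionalStore):
    """Updates in place and keeps an undo log so abort can roll back."""

    def __init__(self):
        self.data, self.undo, self.next_id = {}, {}, 0

    def begin(self):
        self.next_id += 1
        self.undo[self.next_id] = []
        return self.next_id

    def read(self, txn, key):
        return self.data.get(key)

    def write(self, txn, key, value):
        # Remember the old value before overwriting it in place.
        self.undo[txn].append((key, self.data.get(key, _MISSING)))
        self.data[key] = value

    def commit(self, txn):
        del self.undo[txn]  # changes are already in place

    def abort(self, txn):
        # Replay the undo log backwards to restore the old values.
        for key, old in reversed(self.undo.pop(txn)):
            if old is _MISSING:
                self.data.pop(key, None)
            else:
                self.data[key] = old


class VersionBasedStore(TransactionalStore):
    """Buffers writes privately; commit installs a new immutable version."""

    def __init__(self):
        self.versions, self.pending, self.next_id = [{}], {}, 0

    def begin(self):
        self.next_id += 1
        self.pending[self.next_id] = {}
        return self.next_id

    def read(self, txn, key):
        writes = self.pending[txn]
        return writes[key] if key in writes else self.versions[-1].get(key)

    def write(self, txn, key, value):
        self.pending[txn][key] = value

    def commit(self, txn):
        # New version = latest committed version plus this txn's writes.
        self.versions.append({**self.versions[-1], **self.pending.pop(txn)})

    def abort(self, txn):
        self.pending.pop(txn)  # nothing was installed, so just discard


def transfer(store, src, dst, amount):
    """Client code that works unchanged against either scheme."""
    txn = store.begin()
    balance = store.read(txn, src) or 0
    if balance < amount:
        store.abort(txn)
        return False
    store.write(txn, src, balance - amount)
    store.write(txn, dst, (store.read(txn, dst) or 0) + amount)
    store.commit(txn)
    return True
```

The point of the sketch is that a client such as `transfer` never learns which recovery scheme sits underneath, which is what would let a file system, a hash table, and a query processor share one storage manager.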
Query Engine
• Shared-nothing (cluster)
  – all data flow (no blocking ops)
• auto load-balance to micro/macro changes in environment
• adaptivity more important than raw performance!!
  – CONTROL! || ripple join, online reordering
    • (Shankar Raman)
  – continuously reoptimizing query plans
    • tie-ins with STIX (Christos/Sinclair/Russell/Hellerstein)
    • (Ron Avnur)
  – first steps in handling streaming sources
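To make "no blocking ops" concrete, here is a minimal sketch of a ripple-style join written as a Python generator. It is a simplified "square" variant that pulls one tuple from each input per step and keeps everything in memory; the function name `ripple_join` and its signature are assumptions for illustration, not code from CONTROL or Telegraph.

```python
from itertools import zip_longest

_DONE = object()  # sentinel for an input that has run out


def ripple_join(left, right, predicate):
    """Yield matching (l, r) pairs as soon as both tuples have been seen.

    Unlike a blocking hash or sort-merge join, no input has to be read
    to completion before the first result appears.
    """
    seen_left, seen_right = [], []
    for l, r in zip_longest(left, right, fillvalue=_DONE):
        if l is not _DONE:
            # New left tuple: compare with every right tuple seen so far.
            for old_r in seen_right:
                if predicate(l, old_r):
                    yield (l, old_r)
            seen_left.append(l)
        if r is not _DONE:
            # New right tuple: compare with every left tuple seen so far,
            # including the one that just arrived.
            for old_l in seen_left:
                if predicate(old_l, r):
                    yield (old_l, r)
            seen_right.append(r)
```

Because results stream out while the inputs are still arriving, a consumer can show partial answers early, and the same loop works on unbounded streaming sources as long as the consumer stops asking for more.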
Cluster Data Layout
– issues: fragmentation, placement, replication on 10^6 disks. For DB/FS/Web.
– goals: availability, efficiency, consistency, manageability.
– Adaptivity: cooperative vs. competitive ($$) techniques?
– (Mehul Shah)
Global Federation
• Global distribution – federated DBMS layer a la Mariposa/Cohera
• address all the hard stuff they dropped!
  – Global data placement
    • as in cluster, but must be competitive. (Mehul Shah)
  – Global query processing (Amol Deshpande)
    • Agoric query optimization
    • distributed query processing
  – Global metadata
    • yellow pages both for services & datasets
    • Millennium/Ninja tie-ins?
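As a rough picture of agoric (market-based) query optimization, the sketch below has sites bid a price and a promised delay for each subquery, and a broker pick the cheapest bid that meets a deadline. This is a toy under stated assumptions: `Bid`, `pick_winners`, and the `max_delay` rule are hypothetical names and policies, not the protocol used by Mariposa, Cohera, or Telegraph.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bid:
    site: str     # who offers to run the subquery
    price: float  # what the site charges
    delay: float  # promised completion time, in seconds


def pick_winners(bids_by_subquery, max_delay):
    """Choose, per subquery, the cheapest bid that meets the deadline.

    Returns a dict of subquery name -> winning Bid, or None when some
    subquery has no acceptable bid.
    """
    winners = {}
    for subquery, bids in bids_by_subquery.items():
        acceptable = [b for b in bids if b.delay <= max_delay]
        if not acceptable:
            return None
        winners[subquery] = min(acceptable, key=lambda b: b.price)
    return winners


# Hypothetical bids from three sites for two subqueries.
bids = {
    "scan_orders": [Bid("A", 5.0, 2.0), Bid("B", 3.0, 9.0)],
    "scan_items": [Bid("C", 1.0, 1.0)],
}
plan = pick_winners(bids, max_delay=5.0)
```

With a 5-second deadline the broker pays more for site A's faster scan of the orders data; relaxing the deadline to 10 seconds lets the cheaper bid from site B win instead.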
Applications
• Really finding stuff in all the world’s data?
  – UI meets AI meets Logic (browse/mine/query)
• CONTROL is key: seamless, non-blocking interaction
  – multi-res output and feedback during browse/query
  – hints, wizards, training (AI mining, user in the loop)
  – build on existing “scalable spreadsheet”/xform tools (Shankar Raman)