Telegraph: A Universal System for Information

Telegraph History & Plans

• Initial vision
  – Carey, Hellerstein, Stonebraker
  – “Regres”, “B-1”

• Sweat, ideas and further vision
  – 4 of my grads committed
  – Brewer + 2 grads committed
  – Franklin will play
  – obvious tie-ins with other projects

Telegraph Architecture

Layers:
• Query/Browse/Mine
• Global Agoric Federation
• Continuously Reoptimizing Query Processor; Adaptive Data Placement
• Storage Manager (FS, DB, Web)

Related projects (& synergies!):
• Ninja, GiST, IStore
• River, Ninja, Aetherstore, Control, STIX
• Mariposa, Millennium, Control
• Control, DigLib

Storage Manager

• Historic chance to start over!
  – new hardware realities
    • variable-length segments, not blocks
    • big main memories
    • extra CPUs at the devices (IStore)
  – revisit and clean up infrastructure for transactions
    • clean API supporting both log-based & version-based schemes; version-based runs today!
    • big SW Eng. challenge
  – unify DB/FS/Web server!
    • Clients: Ninja’s persistent hash table, query processing, web server, Linux (NT?) filesystem.
  – (Mohan Lakhamraju, Rob von Behren, Steve Gribble)
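To make the "clean API supporting both log-based & version-based schemes" bullet concrete, here is a minimal sketch of what such an interface might look like. It is not Telegraph's actual API: the names `TxnStore`, `LogStore`, and `VersionStore` are hypothetical, the store is an in-memory dictionary, and there is no concurrency control or durability. The only point is that a client written against one interface can run over either recovery scheme.

```python
from abc import ABC, abstractmethod

_MISSING = object()  # marks "key did not exist" in the undo log


class TxnStore(ABC):
    """Hypothetical transactional key-value interface (illustration only)."""

    @abstractmethod
    def begin(self): ...
    @abstractmethod
    def write(self, txn, key, value): ...
    @abstractmethod
    def read(self, txn, key): ...
    @abstractmethod
    def commit(self, txn): ...
    @abstractmethod
    def abort(self, txn): ...


class LogStore(TxnStore):
    """Log-based: update in place, keep an undo log, roll back on abort."""

    def __init__(self):
        self.data, self.undo, self.next_id = {}, {}, 0

    def begin(self):
        self.next_id += 1
        self.undo[self.next_id] = []
        return self.next_id

    def write(self, txn, key, value):
        # Remember the old value before overwriting it in place.
        self.undo[txn].append((key, self.data.get(key, _MISSING)))
        self.data[key] = value

    def read(self, txn, key):
        return self.data.get(key)

    def commit(self, txn):
        del self.undo[txn]  # changes are already in place

    def abort(self, txn):
        # Undo the writes in reverse order.
        for key, old in reversed(self.undo.pop(txn)):
            if old is _MISSING:
                self.data.pop(key, None)
            else:
                self.data[key] = old


class VersionStore(TxnStore):
    """Version-based: write to a private shadow copy, install it on commit."""

    def __init__(self):
        self.data, self.shadow, self.next_id = {}, {}, 0

    def begin(self):
        self.next_id += 1
        self.shadow[self.next_id] = {}
        return self.next_id

    def write(self, txn, key, value):
        self.shadow[txn][key] = value

    def read(self, txn, key):
        if key in self.shadow[txn]:
            return self.shadow[txn][key]
        return self.data.get(key)

    def commit(self, txn):
        self.data.update(self.shadow.pop(txn))

    def abort(self, txn):
        del self.shadow[txn]  # the committed version was never touched
```

A client such as a persistent hash table or a file system would call only `begin`, `read`, `write`, `commit`, and `abort`, so the choice between the two schemes stays behind the interface.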

Query Engine

• Shared-nothing (cluster)
  – all data flow (no blocking ops)
    • auto load-balance to micro/macro changes in environment
    • adaptivity more important than raw performance!!
    • CONTROL! || ripple join, online reordering
    • (Shankar Raman)
  – continuously reoptimizing query plans
    • tie-ins with STIX (Christos/Sinclair/Russell/Hellerstein)
    • (Ron Avnur)
  – first steps in handling streaming sources
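As an illustration of a non-blocking operator like the ripple join named above, the following is a minimal in-memory sketch of a "square" ripple join: it reads one tuple from each input in turn and joins each new tuple against everything seen so far from the other side, so results appear before either input is exhausted. The function name, the predicate argument, and the sample data are assumptions made for this example; the sketch omits the sampling and running-estimate machinery a real online aggregation system would add.

```python
from itertools import zip_longest

_DONE = object()  # sentinel marking an exhausted input


def ripple_join(left, right, predicate):
    """Yield matching (l, r) pairs incrementally, without blocking on an input."""
    seen_left, seen_right = [], []
    for l, r in zip_longest(left, right, fillvalue=_DONE):
        if l is not _DONE:
            # New left tuple: join it with every right tuple seen so far.
            for old_r in seen_right:
                if predicate(l, old_r):
                    yield (l, old_r)
            seen_left.append(l)
        if r is not _DONE:
            # New right tuple: join it with every left tuple seen so far,
            # including the left tuple added in this same step.
            for old_l in seen_left:
                if predicate(old_l, r):
                    yield (old_l, r)
            seen_right.append(r)


if __name__ == "__main__":
    orders = [(1, "a"), (2, "b"), (3, "c")]       # hypothetical sample data
    payments = [(2, 10), (1, 20), (2, 30)]
    for pair in ripple_join(orders, payments, lambda o, p: o[0] == p[0]):
        print(pair)
```

Each pair is produced exactly once, at the step when the later of its two tuples arrives, which is what lets a consumer show partial answers while the inputs are still streaming.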

Cluster Data Layout

– issues: fragmentation, placement, replication on 10^6 disks. For DB/FS/Web.

– goals: availability, efficiency, consistency, manageability.

– Adaptivity: cooperative vs. competitive ($$) techniques?

– (Mehul Shah)

Global Federation

• Global distribution – federated DBMS layer a la Mariposa/Cohera
• address all the hard stuff they dropped!
  – Global data placement
    • as in cluster, but must be competitive. (Mehul Shah)
  – Global query processing (Amol Deshpande)
    • Agoric query optimization
    • distributed query processing
  – Global metadata
    • yellow pages both for services & datasets
    • Millennium/Ninja tie-ins?
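One way to picture agoric (market-based) query optimization is as a small auction: sites bid a price and a promised delay for each subquery, and a broker picks winners. The sketch below is a deliberately simplified illustration, not the Mariposa or Telegraph algorithm. The `Bid` fields, the `choose_bids` function, the site names, and the rule "cheapest bid that meets a delay bound" are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bid:
    site: str        # hypothetical site name
    subquery: str    # which piece of the query the site offers to run
    price: float     # what the site charges
    delay: float     # promised completion time, in seconds


def choose_bids(bids, max_delay):
    """For each subquery, pick the cheapest bid whose delay is acceptable."""
    winners = {}
    for bid in bids:
        if bid.delay > max_delay:
            continue  # too slow, ignore regardless of price
        best = winners.get(bid.subquery)
        if best is None or bid.price < best.price:
            winners[bid.subquery] = bid
    return winners


if __name__ == "__main__":
    bids = [
        Bid("site-a", "scan(orders)", price=5.0, delay=2.0),
        Bid("site-b", "scan(orders)", price=3.0, delay=9.0),
        Bid("site-c", "scan(orders)", price=4.0, delay=3.0),
        Bid("site-a", "join(orders, items)", price=8.0, delay=4.0),
    ]
    for subquery, bid in choose_bids(bids, max_delay=5.0).items():
        print(subquery, "->", bid.site, bid.price)
```

A subquery with no acceptable bid is simply absent from the result, which leaves the caller to decide whether to relax the delay bound or run that piece locally.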

Applications

• Really finding stuff in all the world’s data?
  – UI meets AI meets Logic (browse/mine/query)
• CONTROL is key: seamless, non-blocking interaction
  • multi-res output and feedback during browse/query
  • hints, wizards, training (AI mining, user in the loop)
  • build on existing “scalable spreadsheet”/xform tools (Shankar Raman)
