Data Stream Management Systems
2
Data Streams
Traditional DBMS – data stored in finite, persistent data sets
New Applications – data input as continuous, ordered data streams
3
Applications?
  Network monitoring and traffic engineering
  Healthcare monitoring
  Network security
  Financial applications
  Sensor networks
  Manufacturing processes
  Web logs and click streams
  Massive data sets
4
Sample Applications
Network security (e.g., iPolicy, NetForensics/Cisco, Niksun)
  Network packet streams, user session information
  Queries: URL filtering, detecting intrusions, DoS attacks, and viruses
Financial applications (e.g., Traderbot)
  Streams of trading data, stock tickers, news feeds
  Queries: arbitrage opportunities, analytics, patterns
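
As an illustration of the kind of query listed for network security, the sketch below flags a source that sends too many packets within a sliding time window. It is a minimal example under stated assumptions, not how any of the named products work: the function name `detect_floods`, the `(timestamp, source)` tuple layout, and the `window` and `threshold` values are all hypothetical.

```python
from collections import defaultdict, deque


def detect_floods(packets, window=10.0, threshold=3):
    """Yield (timestamp, source) whenever a source has sent more than
    `threshold` packets within the last `window` seconds.

    `packets` is an iterable of (timestamp, source) pairs in arrival order.
    All names and parameters here are illustrative assumptions.
    """
    recent = defaultdict(deque)  # per-source timestamps still inside the window
    for ts, src in packets:
        q = recent[src]
        q.append(ts)
        # Expire timestamps that have fallen out of the window.
        while q and q[0] <= ts - window:
            q.popleft()
        if len(q) > threshold:
            yield ts, src


packets = [
    (0.0, "10.0.0.1"),
    (1.0, "10.0.0.1"),
    (2.0, "10.0.0.2"),
    (3.0, "10.0.0.1"),
    (4.0, "10.0.0.1"),
    (20.0, "10.0.0.1"),
]
alerts = list(detect_floods(packets))
print(alerts)
```

Each packet is examined once, in arrival order, and only the timestamps inside the current window are kept per source. State still grows with the number of distinct sources, so a real system would also need to bound or expire that map.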
5
Data Stream Management System (DSMS)
[Diagram: a User/Application registers a query with the Stream Query Processor and receives results from it; the processor works in Scratch Space (memory and/or disk).]
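
The components on this slide (register query, stream query processor, results, scratch space) can be sketched as a toy program. This is only an illustrative sketch: the class `MiniDSMS`, its methods, and the sample trade records are hypothetical names chosen for the example, and queries are assumed to be simple per-tuple filters.

```python
from collections import deque


class MiniDSMS:
    """Toy stream query processor (illustrative only).

    Queries are registered once and then evaluated against every
    arriving tuple, rather than run once over stored data.
    """

    def __init__(self, scratch_size=1000):
        self.queries = {}                           # name -> predicate
        self.scratch = deque(maxlen=scratch_size)   # bounded scratch space
        self.results = []                           # (query name, tuple) pairs

    def register_query(self, name, predicate):
        """Register a continuous query expressed as a filter predicate."""
        self.queries[name] = predicate

    def push(self, item):
        """Accept one arriving tuple and evaluate all registered queries."""
        self.scratch.append(item)  # oldest tuple is dropped when full
        for name, predicate in self.queries.items():
            if predicate(item):
                self.results.append((name, item))


dsms = MiniDSMS(scratch_size=3)
dsms.register_query("large_trades", lambda t: t["volume"] > 1000)

trades = [
    {"symbol": "A", "volume": 500},
    {"symbol": "B", "volume": 5000},
    {"symbol": "C", "volume": 200},
    {"symbol": "D", "volume": 1500},
]
for trade in trades:
    dsms.push(trade)

print(dsms.results)
```

The `deque` with `maxlen` stands in for scratch space of fixed size: once it is full, each new tuple evicts the oldest one, so memory use does not grow with the length of the stream.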
6
Meta-Questions
Killer-apps
  Application stream rates exceed DBMS capacity?
  Can DSMS handle high rates anyway?
Motivation
  Need for general-purpose DSMS?
  Not ad-hoc, application-specific systems?
Non-Trivial
  DSMS = merely DBMS with enhanced support for triggers, temporal constructs, data rate management?
7
DBMS versus DSMS

DBMS                                                            DSMS
Persistent relations                                            Transient streams
One-time queries                                                Continuous queries
Random access (pull)                                            Sequential access (push)
“Unbounded” disk store                                          Bounded main memory
Only current state matters                                      History/arrival-order is critical
Passive repository                                              Active stores
Relatively low update rate                                      Multi-GB arrival rates
No real-time services                                           Real-time requirements
Assume precise data                                             Data stale/imprecise
Access plan determined by query processor, physical DB design   Unpredictable/variable data arrival & characteristics
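
Two rows of the comparison, one-time versus continuous queries and "unbounded" store versus bounded memory, can be made concrete with a small sketch. The function names and the sample values are assumptions made for illustration; the average is just one example of an aggregate that can be maintained in constant memory.

```python
def one_time_average(stored_rows):
    """DBMS style: the data is stored, and the query runs once over all of it."""
    return sum(stored_rows) / len(stored_rows)


def continuous_average(stream):
    """DSMS style: one sequential pass, constant memory, and an updated
    answer after every arriving value."""
    count = 0
    total = 0.0
    for value in stream:
        count += 1
        total += value
        yield total / count


values = [10, 20, 30, 40]

batch_answer = one_time_average(values)
running_answers = list(continuous_average(iter(values)))

print(batch_answer)
print(running_answers)
```

The continuous version never stores the stream itself, only a count and a running total, and it sees each value exactly once in arrival order. Its final answer matches the one-time query over the same values.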