queries over streaming sensor data sam madden db lunch october 12, 2001
![Page 1: Queries over Streaming Sensor Data Sam Madden DB Lunch October 12, 2001](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649dc55503460f94ab8896/html5/thumbnails/1.jpg)
Queries over Streaming Sensor Data
Sam Madden
DB Lunch
October 12, 2001
![Page 2: Queries over Streaming Sensor Data Sam Madden DB Lunch October 12, 2001](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649dc55503460f94ab8896/html5/thumbnails/2.jpg)
Outline
• Background
• Server Side Solutions
  • Fjords, Sensor Proxies, CACQ
• Sensor Side Solutions
  • Catalog Management
  • Aggregation
• Future Work
![Page 3: Queries over Streaming Sensor Data Sam Madden DB Lunch October 12, 2001](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649dc55503460f94ab8896/html5/thumbnails/3.jpg)
Background: Sensor Networks
![Page 4: Queries over Streaming Sensor Data Sam Madden DB Lunch October 12, 2001](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649dc55503460f94ab8896/html5/thumbnails/4.jpg)
Sensor Networks
• Small, low-cost, battery-powered microprocessors with 1–4 sensors
  • Light, temperature, vibration, acceleration, AC power, humidity
• 10 kBit – 1 Mbit wireless networks, 100 ft range
• “Ad-hoc” networking – no predefined routes
• Cal, MIT, UCLA OS and networking communities committed
![Page 5: Queries over Streaming Sensor Data Sam Madden DB Lunch October 12, 2001](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649dc55503460f94ab8896/html5/thumbnails/5.jpg)
SmartDust
• Sensor nets motivated by the “SmartDust Vision”: millimeter-scale microprocessors, sensors, and wireless communication for pennies.
• Deployed in thousands, no concern for reliability of a single sensor.
• Requires: position detection, fault tolerance, aggregation, etc.
![Page 6: Queries over Streaming Sensor Data Sam Madden DB Lunch October 12, 2001](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649dc55503460f94ab8896/html5/thumbnails/6.jpg)
Rene / Mica Motes
• SmartDust stand-in, ~2 cm x 3 cm, OTS.
• Processor: Atmel 8535, 4 MHz, 5 mA
• Radio: RFM TR1000, 911 MHz, 10 kBits, ~25 mJ/msg, 20-30 msg / sec
• Memory: 512 B RAM, 8k Flash, 32k EEPROM
  • Flash R/O
  • EEPROM slow
• Power: 575 mAh battery
  • Peak load: 19.5 mA, idle 3.1 mA, sleeping 10 uA.
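The power figures above imply very different lifetimes depending on how much time a mote spends asleep. The following is a rough back-of-the-envelope sketch, assuming the full 575 mAh is usable and the current draw is constant in each state (both simplifications); the function names and the 1% duty cycle are illustrative and not from the talk.

```python
# Battery capacity and current draws taken from the slide (mAh and mA).
BATTERY_MAH = 575.0
DRAW_MA = {"peak": 19.5, "idle": 3.1, "sleeping": 0.010}


def lifetime_hours(capacity_mah, draw_ma):
    """Hours of operation at a constant current draw."""
    return capacity_mah / draw_ma


def duty_cycled_hours(capacity_mah, active_fraction, active_ma, sleep_ma):
    """Hours of operation when active for a fraction of the time and asleep otherwise."""
    average_ma = active_fraction * active_ma + (1.0 - active_fraction) * sleep_ma
    return capacity_mah / average_ma


if __name__ == "__main__":
    for state, draw in DRAW_MA.items():
        print(f"{state}: {lifetime_hours(BATTERY_MAH, draw):.1f} h")
    # Hypothetical 1% duty cycle at peak load, asleep the rest of the time.
    hours = duty_cycled_hours(BATTERY_MAH, 0.01, DRAW_MA["peak"], DRAW_MA["sleeping"])
    print(f"1% duty cycle: {hours:.0f} h")
```

At peak load the battery lasts a little over a day, while a mostly sleeping mote can last months, which is why the later slides focus on reducing communication.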
![Page 7: Queries over Streaming Sensor Data Sam Madden DB Lunch October 12, 2001](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649dc55503460f94ab8896/html5/thumbnails/7.jpg)
TinyOS
• Lightweight OS for sensors
• Event-based
• Active-message, multi-hop networking
• Auto-idling
• Network reprogramming, time synchronization, etc.
[18] J. Hill, R. Szewczyk, A. Woo, S. Hollar, D. Culler, and K. Pister. System architecture directions for networked sensors. In Proceedings of the 9th International Conference on Architectural Support for Programming Languages and Operating Systems, November 2000.
![Page 8: Queries over Streaming Sensor Data Sam Madden DB Lunch October 12, 2001](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649dc55503460f94ab8896/html5/thumbnails/8.jpg)
Applications of Sensor Nets
• Space Monitoring
• Power, light, temp in buildings
• Temperature, humidity
• Traffic
• Military
• Structural
• Personal Networks
![Page 9: Queries over Streaming Sensor Data Sam Madden DB Lunch October 12, 2001](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649dc55503460f94ab8896/html5/thumbnails/9.jpg)
Database Opportunities
• All applications depend on data processing
• Declarative query language over sensors attractive
• Want “to combine and aggregate data streaming from motes.”
• Sounds like a database…
![Page 10: Queries over Streaming Sensor Data Sam Madden DB Lunch October 12, 2001](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649dc55503460f94ab8896/html5/thumbnails/10.jpg)
Database Challenges
• Sensors unreliable
  • Come on and offline, variable bandwidth
• Sensors push data
• Sensors stream data
• Sensors have limited memory, power, bandwidth
• Sensors have processors
![Page 11: Queries over Streaming Sensor Data Sam Madden DB Lunch October 12, 2001](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649dc55503460f94ab8896/html5/thumbnails/11.jpg)
Outline
• Background
• Server Side Solutions
  • Fjords, Sensor Proxies, CACQ
• Sensor Side Solutions
  • Catalog Management
  • Aggregation
• Future Work
![Page 12: Queries over Streaming Sensor Data Sam Madden DB Lunch October 12, 2001](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649dc55503460f94ab8896/html5/thumbnails/12.jpg)
Fjords
• Query plan abstraction to handle lack of reliability and streaming, push-based data
• Combine push and pull in arbitrary combinations
• Use connectors between operators to isolate them from flow direction
  • “Bracket Model” – Graefe ‘93
![Page 13: Queries over Streaming Sensor Data Sam Madden DB Lunch October 12, 2001](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649dc55503460f94ab8896/html5/thumbnails/13.jpg)
Fjords (Continued)
• Operators assume a non-blocking queue interface between each other.
• Queues implement push vs. pull
  • Pull from A to B: Suspend A, schedule B until it produces data. A cannot go forward until B produces data.
  • Push from B to A: A polls, scheduler thread invokes B until it produces data. A can process other inputs while waiting for B.
• Supports parallelism between operators via queues, state machines, and OS (e.g. NIC buffers, DMA) in an operator-transparent way.
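The queue interface described above can be sketched roughly as follows. This is an illustrative sketch, not the Fjords implementation: the class names `PushQueue` and `PullQueue` and the helper `drain_ready` are hypothetical, and a real system would run the producers under a scheduler thread rather than inline.

```python
from collections import deque


class PushQueue:
    """Push connector: a producer deposits tuples; get() never blocks."""

    def __init__(self):
        self._items = deque()

    def put(self, item):
        self._items.append(item)

    def get(self):
        # Return the next tuple, or None if nothing has arrived yet.
        return self._items.popleft() if self._items else None


class PullQueue:
    """Pull connector: asking for a tuple runs the producer on demand."""

    def __init__(self, producer):
        self._producer = iter(producer)

    def get(self):
        # The consumer waits here until the producer yields (or is exhausted).
        return next(self._producer, None)


def drain_ready(queues):
    """One consumer step: poll every input once and collect what is available."""
    ready = []
    for queue in queues:
        item = queue.get()
        if item is not None:
            ready.append(item)
    return ready
```

Because both connectors expose the same `get()` call, an operator written against it does not need to know whether its input is pushed by a sensor or pulled from stored data.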
![Page 14: Queries over Streaming Sensor Data Sam Madden DB Lunch October 12, 2001](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649dc55503460f94ab8896/html5/thumbnails/14.jpg)
Fjords Example
[Figure: example Fjord query plan; connections between operators are labeled Push, Push, and Pull]
Samuel Madden, Michael J. Franklin. Fjording The Stream: An Architecture For Queries Over Streaming Sensor Data. International Conference on Data Engineering, 2002. To Appear, February 2002.
![Page 23: Queries over Streaming Sensor Data Sam Madden DB Lunch October 12, 2001](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649dc55503460f94ab8896/html5/thumbnails/23.jpg)
Fjords Applications
• Combine traffic streams with web-based accident reports
Francis Li, Sam Madden, Megan Thomas. Traffic Visualization. http://www.cs.berkeley.edu/~mct/infovis/project/traffic.html
![Page 24: Queries over Streaming Sensor Data Sam Madden DB Lunch October 12, 2001](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649dc55503460f94ab8896/html5/thumbnails/24.jpg)
Operators for Streaming Data
• Need special operators for dealing with streams (see P. Seshadri et al. The design and implementation of a sequence database system. VLDB ’96)
• In particular, streams can’t be joined or sorted in the traditional sense
• Solution: use windows, e.g. “Zipper Join”
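The slide does not define the zipper join, so the sketch below is an assumption: it treats it as a merge-style join over two timestamp-ordered inputs that pairs tuples whose timestamps lie within a fixed window. The function name, tuple layout, and `window` parameter are all illustrative.

```python
def zipper_join(left, right, window):
    """Join two timestamp-ordered lists of (timestamp, value) tuples.

    A left tuple matches every right tuple whose timestamp is within
    `window` of its own. Only a bounded slice of `right` is examined per
    left tuple, which is what makes the join feasible on streams.
    """
    results = []
    start = 0  # index of the oldest right tuple that can still match
    for l_ts, l_val in left:
        # Discard right tuples that are too old for this and all later left tuples.
        while start < len(right) and right[start][0] < l_ts - window:
            start += 1
        j = start
        while j < len(right) and right[j][0] <= l_ts + window:
            results.append((l_ts, l_val, right[j][1]))
            j += 1
    return results
```

For example, joining `[(1, "a"), (5, "b")]` with `[(2, "x"), (6, "y")]` and a window of 1 pairs "a" with "x" and "b" with "y".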
![Page 25: Queries over Streaming Sensor Data Sam Madden DB Lunch October 12, 2001](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649dc55503460f94ab8896/html5/thumbnails/25.jpg)
Sensor Proxy
• Energy-sensitive database operator
• Buffer sensor tuples and route to multiple user queries to hide query load from sensors
• Push aggregation operators into sensors to reduce communications load
• Dynamically adjust sample rate based on user demand
• Push results into Fjords so that other operators don’t block waiting on slow or dead sensors
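Two of the proxy duties listed above, fanning one sensor tuple out to many queries and choosing the sample rate from user demand, can be sketched as below. The `SensorProxy` class and its methods are hypothetical names for illustration; picking the fastest requested rate is an assumed policy, and slower queries simply receive every tuple in this simplified version.

```python
class SensorProxy:
    """Sits between one sensor and many user queries."""

    def __init__(self):
        # query id -> (requested sample rate in Hz, buffered output tuples)
        self._queries = {}

    def register(self, query_id, rate_hz):
        self._queries[query_id] = (rate_hz, [])

    def unregister(self, query_id):
        self._queries.pop(query_id, None)

    def sample_rate(self):
        """Rate to request from the sensor: the fastest any query needs, 0 if idle."""
        return max((rate for rate, _ in self._queries.values()), default=0.0)

    def on_tuple(self, reading):
        """The sensor sent one tuple; copy it to every registered query."""
        for _, output in self._queries.values():
            output.append(reading)

    def results(self, query_id):
        return self._queries[query_id][1]
```

The sensor transmits each reading once regardless of how many queries are registered, and it can stop sampling entirely when `sample_rate()` drops to zero.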
![Page 26: Queries over Streaming Sensor Data Sam Madden DB Lunch October 12, 2001](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649dc55503460f94ab8896/html5/thumbnails/26.jpg)
Some Results
• Pushing predicates into sensors can vastly reduce costs:
[Figure: bar chart "Power Drain (W) vs. Sample Method"; x-axis: Sampling Method (Every Sample, Every Vehicle); y-axis: Power (W), 0 to 0.008]
• Atmel Simulator
• 100 samples / sec
• 5 vehicles / sec
• 7x power savings
![Page 27: Queries over Streaming Sensor Data Sam Madden DB Lunch October 12, 2001](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649dc55503460f94ab8896/html5/thumbnails/27.jpg)
CACQ
• Expect hundreds to thousands of queries over same sensor sources
• Continuously Adaptive Continuous Queries
• Continuous Queries: long-running queries which combine selections and joins to improve efficiency (see Chen, NiagaraCQ, SIGMOD 2000)
[Figure: Query 1 and Query 2 over Stock Quotes, with selections Stocks.symbol = ‘MSFT’ and Stocks.symbol = ‘APPL’; a second Stock Quotes source is labeled ‘MSFT’ and ‘APPL’]
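One way to picture the sharing in the stock-quote example is an index over the queries' equality predicates, so that each incoming tuple is looked up once instead of being tested by every query. This is a minimal sketch under that assumption; `SharedSelection` and its methods are hypothetical names and not the CACQ or NiagaraCQ data structures.

```python
from collections import defaultdict


class SharedSelection:
    """Index equality predicates on one attribute across many queries."""

    def __init__(self, attribute):
        self.attribute = attribute
        self._by_value = defaultdict(set)  # constant -> ids of queries selecting it

    def add_query(self, query_id, value):
        self._by_value[value].add(query_id)

    def route(self, tup):
        """Return the ids of all queries whose predicate this tuple satisfies."""
        return set(self._by_value.get(tup[self.attribute], ()))
```

With queries registered for 'MSFT' and 'APPL', a quote for 'MSFT' is delivered to the first query after a single dictionary lookup, however many queries exist.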
![Page 28: Queries over Streaming Sensor Data Sam Madden DB Lunch October 12, 2001](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649dc55503460f94ab8896/html5/thumbnails/28.jpg)
CACQ (Cont.)
• Continuous adaptivity from Eddies
• Route tuples differently, depending on selectivity and cost estimates of operators
[Figure: static dataflow vs. eddy]
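The routing idea can be illustrated with a deliberately simplified eddy that sends each tuple to the filter with the lowest observed pass rate first. This is an assumption-laden sketch: it uses a greedy ordering rather than the routing policy of the Eddies work, it ignores operator cost, and the `Eddy` class is a hypothetical name.

```python
class Eddy:
    """Routes each tuple through filters, most selective (so far) first."""

    def __init__(self, filters):
        # filters: dict mapping a name to a predicate function
        self.filters = dict(filters)
        # Per-filter statistics: [tuples seen, tuples passed]
        self.stats = {name: [0, 0] for name in self.filters}

    def _pass_rate(self, name):
        seen, passed = self.stats[name]
        return passed / seen if seen else 1.0

    def process(self, tup):
        """Return True if the tuple satisfies every filter."""
        order = sorted(self.filters, key=self._pass_rate)
        for name in order:
            self.stats[name][0] += 1
            if not self.filters[name](tup):
                return False  # dropped early; remaining filters are skipped
            self.stats[name][1] += 1
        return True
```

Because the ordering is recomputed per tuple from running statistics, the plan shifts on its own if a filter's selectivity changes while the query is running.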
![Page 29: Queries over Streaming Sensor Data Sam Madden DB Lunch October 12, 2001](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649dc55503460f94ab8896/html5/thumbnails/29.jpg)
CACQ (cont.)
• Combining CA with CQ is a win:
  • CQ increases number of simultaneous queries
  • Adaptivity well suited to long running queries
  • Eddies allow us to avoid ugly query-optimization phase in traditional CQ
  • Eddies + Streams == few copies, unlike traditional CQ
![Page 30: Queries over Streaming Sensor Data Sam Madden DB Lunch October 12, 2001](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649dc55503460f94ab8896/html5/thumbnails/30.jpg)
CACQ (cont)
Look for a paper in SIGMOD 2002 (fingers crossed!)
![Page 31: Queries over Streaming Sensor Data Sam Madden DB Lunch October 12, 2001](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649dc55503460f94ab8896/html5/thumbnails/31.jpg)
Outline
• Background
• Server Side Solutions
  • Fjords, Sensor Proxies, CACQ
• Sensor Side Solutions
  • Catalog Management
  • Aggregation
• Future Work
![Page 32: Queries over Streaming Sensor Data Sam Madden DB Lunch October 12, 2001](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649dc55503460f94ab8896/html5/thumbnails/32.jpg)
Sensor Side Solutions
• CACQ + Fjords provides interface + performance on QP, but sensors still need help:
  • Locate / identify sensors
  • Reduce power consumption
  • Take advantage of processors?
  • Improve responsiveness
![Page 33: Queries over Streaming Sensor Data Sam Madden DB Lunch October 12, 2001](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649dc55503460f94ab8896/html5/thumbnails/33.jpg)
Cataloging Sensors
• To query sensors, need a way to locate, identify properties, extract values
• Goal: Drop a bunch of sensors around the DBMS, allow them to be queried without manual effort
• Idea: Add a layer to each sensor which advertises its capabilities
![Page 34: Queries over Streaming Sensor Data Sam Madden DB Lunch October 12, 2001](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649dc55503460f94ab8896/html5/thumbnails/34.jpg)
Catalog (Continued)
    #temperature sensor
    field {
        name : "temp" #optional
        type : int
        units : celsius
        min : -20
        max : 100
        bits : 8
        sample_cost : 10.0 J #optional -- for use in costing
        sample_time : 10.0 ms #optional -- for use in costing
        input : adc2 #optional : read from adc channel 1
        sends : ondemand
        accessorEvent : GET_TEMPERATURE_DATA
        responseEvent : TEMPERATURE_DATA_READY
    }
• Compiled in 27 bytes of memory
• Layer to register with Telegraph
• Can be “push” or “pull”
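On the DBMS side, a catalog entry like the one above could be read into a dictionary with a few lines of code. The sketch below assumes the syntax is exactly what the single example shows (one `key : value` pair per line, `#` starting a comment, `field { ... }` delimiting the entry); the actual grammar is not given in the talk, and `parse_field` is a hypothetical helper.

```python
def parse_field(text):
    """Parse one 'field { ... }' catalog entry into a dict of strings."""
    entry = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line in ("field {", "}"):
            continue
        key, _, value = line.partition(":")
        entry[key.strip()] = value.strip().strip('"')
    return entry


EXAMPLE = """
#temperature sensor
field {
    name : "temp" #optional
    type : int
    min : -20
    max : 100
    sends : ondemand
}
"""

if __name__ == "__main__":
    print(parse_field(EXAMPLE))
```

Values are kept as strings here; converting `min`, `max`, and `bits` to integers, or splitting units off `sample_cost`, would need the real schema.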
![Page 35: Queries over Streaming Sensor Data Sam Madden DB Lunch October 12, 2001](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649dc55503460f94ab8896/html5/thumbnails/35.jpg)
Aggregating Over Sensors
• Sensor Proxy combines user queries, pushes down aggregates
• Goal: Save energy, increase efficiency
• Idea: Take advantage of the routing hierarchy (example soon!)
![Page 36: Queries over Streaming Sensor Data Sam Madden DB Lunch October 12, 2001](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649dc55503460f94ab8896/html5/thumbnails/36.jpg)
Why bother with aggregation
• Individual sensor readings are of limited use
  • Interest in higher-level properties, e.g. what vehicles drove through, what is the spread of temperatures in the building
  • We have a processor & network on board, let’s use it
• We cannot survive without aggregation
  • Delivering a message to all nodes much easier than delivering a message from each node to a central point
  • Delivering a large amount of data from every node harder still, vide connectivity experiment
  • Forwarding raw information too expensive
    • Scarce energy
    • Scarce bandwidth
    • Multihop performance penalty
![Page 37: Queries over Streaming Sensor Data Sam Madden DB Lunch October 12, 2001](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649dc55503460f94ab8896/html5/thumbnails/37.jpg)
Aggregation challenges
• Inherently unreliable environment, certain information unavailable or expensive to obtain
  • how many nodes are present?
  • how many nodes are supposed to respond?
  • what is the error distribution (in particular, what about malicious nodes?)
• Trying to build an infrastructure to remove all uncertainty from the application may not be feasible – do we want to build distributed transactions?
• Information trickles in one message at a time
  • Never have complete and up-to-date information about the neighborhood
• What type of information should we expect from aggregation
  • Streams
  • Robust estimates
![Page 38: Queries over Streaming Sensor Data Sam Madden DB Lunch October 12, 2001](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649dc55503460f94ab8896/html5/thumbnails/38.jpg)
Scenario: Count
[Figure: network of sensor nodes 1, 2, 3, 4, 5]
![Page 39: Queries over Streaming Sensor Data Sam Madden DB Lunch October 12, 2001](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649dc55503460f94ab8896/html5/thumbnails/39.jpg)
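Scenario: Count
[Figure: network of sensor nodes 1, 2, 3, 4, 5]
Goal: Count the number of nodes in the network.
Number of children is unknown.
Reported counts (columns: Sensor #, rows: Time):
| Time | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| 1 | - | - | - | - | - |
| 2 | - | - | - | - | - |
| 3 | - | - | - | - | - |
| 4 | - | - | - | - | - |
| 5 | - | - | - | - | - |
| 6 | - | - | - | - | - |
| 7 | - | - | - | - | - |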
![Page 40: Queries over Streaming Sensor Data Sam Madden DB Lunch October 12, 2001](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649dc55503460f94ab8896/html5/thumbnails/40.jpg)
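Scenario: Count
[Figure: network of sensor nodes 1, 2, 3]
Goal: Count the number of nodes in the network.
Number of children is unknown.
Reported counts (columns: Sensor #, rows: Time):
| Time | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| 1 | 1 | - | - | - | - |
| 2 | - | - | - | - | - |
| 3 | - | - | - | - | - |
| 4 | - | - | - | - | - |
| 5 | - | - | - | - | - |
| 6 | - | - | - | - | - |
| 7 | - | - | - | - | - |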
![Page 41: Queries over Streaming Sensor Data Sam Madden DB Lunch October 12, 2001](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649dc55503460f94ab8896/html5/thumbnails/41.jpg)
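Scenario: Count
[Figure: network of sensor nodes 1, 2, 3, 4]
Goal: Count the number of nodes in the network.
Number of children is unknown.
Reported counts (columns: Sensor #, rows: Time):
| Time | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| 1 | 1 | - | - | - | - |
| 2 | 1 | 1 | 1 | - | - |
| 3 | 1 + 2 | 1 | 1 | - | - |
| 4 | - | - | - | - | - |
| 5 | - | - | - | - | - |
| 6 | - | - | - | - | - |
| 7 | - | - | - | - | - |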
![Page 42: Queries over Streaming Sensor Data Sam Madden DB Lunch October 12, 2001](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649dc55503460f94ab8896/html5/thumbnails/42.jpg)
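Scenario: Count
[Figure: network of sensor nodes 1, 2, 3, 4, 5]
Goal: Count the number of nodes in the network.
Number of children is unknown.
Reported counts (columns: Sensor #, rows: Time):
| Time | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| 1 | 1 | - | - | - | - |
| 2 | 1 | 1 | 1 | - | - |
| 3 | 1 + 2 | 1 | 1 | 1 | - |
| 4 | 1 + 2 | 1 + ½ | 1 + ½ | 1 | - |
| 5 | - | - | - | - | - |
| 6 | - | - | - | - | - |
| 7 | - | - | - | - | - |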
![Page 43: Queries over Streaming Sensor Data Sam Madden DB Lunch October 12, 2001](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649dc55503460f94ab8896/html5/thumbnails/43.jpg)
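Scenario: Count
[Figure: network of sensor nodes 1, 2, 3, 4, 5]
Goal: Count the number of nodes in the network.
Number of children is unknown.
Reported counts (columns: Sensor #, rows: Time):
| Time | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| 1 | 1 | - | - | - | - |
| 2 | 1 | 1 | 1 | - | - |
| 3 | 1 + 2 | 1 | 1 | 1 | - |
| 4 | 1 + 2 | 1 + ½ | 1 + ½ | 1 | 1 |
| 5 | 1 + 3 | 1 + ½ | 1 + ½ | 1 + 1 | 1 |
| 6 | - | - | - | - | - |
| 7 | - | - | - | - | - |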
![Page 44: Queries over Streaming Sensor Data Sam Madden DB Lunch October 12, 2001](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649dc55503460f94ab8896/html5/thumbnails/44.jpg)
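Scenario: Count
[Figure: network of sensor nodes 1, 2, 3, 4, 5]
Goal: Count the number of nodes in the network.
Number of children is unknown.
Reported counts (columns: Sensor #, rows: Time):
| Time | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| 1 | 1 | - | - | - | - |
| 2 | 1 | 1 | 1 | - | - |
| 3 | 1 + 2 | 1 | 1 | 1 | - |
| 4 | 1 + 2 | 1 + ½ | 1 + ½ | 1 | 1 |
| 5 | 1 + 3 | 1 + ½ | 1 + ½ | 1 + 1 | 1 |
| 6 | 1 + 3 | 1 + 2/2 | 1 + 2/2 | 1 + 1 | 1 |
| 7 | - | - | - | - | - |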
![Page 45: Queries over Streaming Sensor Data Sam Madden DB Lunch October 12, 2001](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649dc55503460f94ab8896/html5/thumbnails/45.jpg)
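Scenario: Count
[Figure: network of sensor nodes 1, 2, 3, 4, 5]
Goal: Count the number of nodes in the network.
Number of children is unknown.
Reported counts (columns: Sensor #, rows: Time):
| Time | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| 1 | 1 | - | - | - | - |
| 2 | 1 | 1 | 1 | - | - |
| 3 | 1 + 2 | 1 | 1 | 1 | - |
| 4 | 1 + 2 | 1 + ½ | 1 + ½ | 1 | 1 |
| 5 | 1 + 3 | 1 + ½ | 1 + ½ | 1 + 1 | 1 |
| 6 | 1 + 3 | 1 + 2/2 | 1 + 2/2 | 1 + 1 | 1 |
| 7 | 1 + 4 | 1 + 2/2 | 1 + 2/2 | 1 + 1 | 1 |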
![Page 46: Queries over Streaming Sensor Data Sam Madden DB Lunch October 12, 2001](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649dc55503460f94ab8896/html5/thumbnails/46.jpg)
Counting Lessons
• Take advantage of redundancy to improve accuracy (reply to all parents, not just one)
• Use broadcast to reduce number of messages
• Result is a stream of values: much more robust to failures, movement, or collision than a single value.
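The counting scenario can be reproduced with a small simulation. The topology below is an assumption inferred from the counts in the scenario (node 1 is the root, nodes 2 and 3 are its children, node 4 reports to both 2 and 3, node 5 reports to 4), and the synchronous rounds are a simplification: in the slides nodes join one at a time. Each node broadcasts 1 plus the sum of its children's last reports, with each child's report divided by its number of parents.

```python
from fractions import Fraction

# Assumed topology: node -> list of parents it reports to.
PARENTS = {1: [], 2: [1], 3: [1], 4: [2, 3], 5: [4]}


def children_of(parents):
    """Invert the parent map into node -> list of children."""
    kids = {node: [] for node in parents}
    for child, node_parents in parents.items():
        for parent in node_parents:
            kids[parent].append(child)
    return kids


def run_count(parents, rounds):
    """Return the per-round estimates as a list of {node: count} dicts."""
    kids = children_of(parents)
    estimate = {node: Fraction(1) for node in parents}  # each node counts itself
    history = [dict(estimate)]
    for _ in range(rounds):
        previous = estimate
        estimate = {
            node: 1 + sum(previous[c] / len(parents[c]) for c in kids[node])
            for node in parents
        }
        history.append(dict(estimate))
    return history


if __name__ == "__main__":
    for step, values in enumerate(run_count(PARENTS, 4)):
        print(step, {node: str(v) for node, v in values.items()})
```

The root's estimate forms a stream that rises and then stays at the true count, so a lost message delays convergence rather than producing a single wrong answer.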
![Page 47: Queries over Streaming Sensor Data Sam Madden DB Lunch October 12, 2001](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649dc55503460f94ab8896/html5/thumbnails/47.jpg)
Aggregation in network programming
• Network programming problem
  • Reliable delivery of a large number of messages to all nodes in range, while exploiting the broadcast nature of the medium
• Basic setup
  • Broadcast a known number of idempotent program fragments
  • Each node keeps a bitmap of fragments received (1 = packet received)
  • Two stages of the problem: single hop, and multihop
• Solutions
  • Single hop, dense cell
    • Broadcasting the program – trivial, the central node broadcasts
    • Feedback from nodes – broadcast a request from the central node: Is anyone missing packets in this packet range?
    • Convergence: no replies to the request
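The single-hop loop described above can be sketched as a simulation. Everything here is illustrative: the function name, the `delivers` callback that models packet loss, the range size of 8, and the choice to rebroadcast a whole range when any node reports a gap are assumptions, not details from the talk.

```python
def single_hop_program(num_fragments, nodes, delivers, range_size=8, max_rounds=100):
    """Broadcast fragments until no node reports a missing packet.

    nodes: dict mapping node name -> list of 0/1 bits (1 = fragment received).
    delivers(name, fragment, attempt): True if that broadcast reaches the node.
    Returns how many times each fragment was broadcast.
    """
    attempts = [0] * num_fragments

    def broadcast(fragments):
        for frag in fragments:
            attempts[frag] += 1
            for name, bitmap in nodes.items():
                if delivers(name, frag, attempts[frag]):
                    bitmap[frag] = 1  # idempotent: receiving twice is harmless

    broadcast(range(num_fragments))
    for _ in range(max_rounds):
        replies = 0
        for lo in range(0, num_fragments, range_size):
            hi = min(lo + range_size, num_fragments)
            # "Is anyone missing packets in this range?" Any reply triggers a resend.
            if any(0 in bitmap[lo:hi] for bitmap in nodes.values()):
                replies += 1
                broadcast(range(lo, hi))
        if replies == 0:
            return attempts  # convergence: no replies to the request
    raise RuntimeError("did not converge")
```

The central node never learns which node was missing what; it only needs to know that someone replied, which keeps feedback traffic small in a dense cell.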
![Page 48: Queries over Streaming Sensor Data Sam Madden DB Lunch October 12, 2001](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649dc55503460f94ab8896/html5/thumbnails/48.jpg)
Aggregation in multihop network programming
• Broadcasting the program – use flooding
  • Remember the last 8 packets forwarded, use that cache to decide whether to forward or not
• Feedback from nodes
  • Distribute requests for feedback using flooding
  • After some delay, respond if any packets are missing locally
  • Responses from children: AND with the local bitmap, store the result locally, forward the request
  • Suboptimal because there are no local fixups
• Convergence
  • No replies to the request
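Two pieces of the multihop scheme lend themselves to a short sketch: the cache of recently forwarded packets that stops flooding from looping, and the AND of a node's bitmap with its children's responses. The names below are hypothetical, bitmaps are modeled as Python integers, and the recursion stands in for responses travelling up the tree.

```python
from collections import deque


class ForwardCache:
    """Remembers the last few packet ids forwarded, to suppress duplicates."""

    def __init__(self, size=8):
        self._recent = deque(maxlen=size)  # oldest id is evicted automatically

    def should_forward(self, packet_id):
        if packet_id in self._recent:
            return False
        self._recent.append(packet_id)
        return True


def subtree_bitmap(node, local, children):
    """AND a node's bitmap with the aggregated bitmaps of its children.

    A bit stays 1 only if every node in the subtree has that fragment.
    """
    combined = local[node]
    for child in children.get(node, ()):
        combined &= subtree_bitmap(child, local, children)
    return combined


def missing_fragments(bitmap, num_fragments):
    """Fragment indices that at least one node in the subtree still lacks."""
    return [i for i in range(num_fragments) if not (bitmap >> i) & 1]
```

The root ends up with one bitmap for the whole network, so the feedback it receives stays the same size no matter how many nodes are missing a fragment.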
![Page 49: Queries over Streaming Sensor Data Sam Madden DB Lunch October 12, 2001](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649dc55503460f94ab8896/html5/thumbnails/49.jpg)
Aggregation over streams
• Inherent uncertainty of the system
  • Can nodes communicate, do they have enough power, have they moved?
  • Computing a complete single answer can be very expensive, and may not be possible
  • Partial estimates have their own value
• Aggregation over streams
  • Values reflect the current best estimates
  • Self-stabilizing: in the absence of changes, converges to a desired value within N steps
![Page 50: Queries over Streaming Sensor Data Sam Madden DB Lunch October 12, 2001](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649dc55503460f94ab8896/html5/thumbnails/50.jpg)
What does it mean to aggregate (The DB Perspective)
• General purpose solution: apply standard aggregation operators like COUNT, MIN, MAX, AVERAGE, and SUM to any set of sensors.
  • Previous examples are application specific
  • In sensors, operators may be arbitrary signal processing functions
• Provide grouping semantics: e.g. ‘select avg(temp) group by trunc(light/10)’
  • In sensor networks, groups may be random samples
[Figure: 3 x 3 grid of tuples t1 through t9]
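The grouped query on this slide, `select avg(temp) group by trunc(light/10)`, can be evaluated in the network if each node forwards a partial state per group instead of raw readings. The sketch below assumes the partial state for an average is a (sum, count) pair, which is a standard decomposition but is not spelled out in the talk; the function names are illustrative.

```python
import math


def local_partial(temp, light):
    """Partial state for one node's reading: {group: (sum of temps, count)}."""
    group = math.trunc(light / 10)
    return {group: (temp, 1)}


def merge(a, b):
    """Combine two partial states, as an intermediate node would."""
    out = dict(a)
    for group, (total, count) in b.items():
        prev_total, prev_count = out.get(group, (0.0, 0))
        out[group] = (prev_total + total, prev_count + count)
    return out


def finalize(partials):
    """Turn merged (sum, count) pairs into per-group averages at the root."""
    return {group: total / count for group, (total, count) in partials.items()}
```

Merging is associative and commutative, so partial states can be combined in whatever order the routing tree delivers them.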
![Page 51: Queries over Streaming Sensor Data Sam Madden DB Lunch October 12, 2001](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649dc55503460f94ab8896/html5/thumbnails/51.jpg)
Identifying Groups
• Need a way to identify groups
  • Idea: set of membership criteria pushed down
  • Nodes determine their membership set based on those criteria
  • Nodes can be in multiple but not unlimited groups
  • E.g. “Group 1 : 0 <= t < 10, Group 2 : 10 <= t < 20, …”
• Need a way to evaluate aggregation predicates by group
• May want to allow grouping and aggregation predicates to be expressed together to take advantage of broadcast effects
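A node-side membership check for range criteria like the ones in the example might look like the sketch below. The half-open ranges follow the slide's "0 <= t < 10" form; the `max_groups` cap is an assumed way to model "multiple but not unlimited groups", and the function name is hypothetical.

```python
def memberships(reading, criteria, max_groups=4):
    """Return the ids of the groups this node's reading falls into.

    criteria: dict mapping group id -> (low, high), meaning low <= reading < high.
    max_groups: illustrative cap on how many groups one node will join.
    """
    groups = [gid for gid, (low, high) in criteria.items() if low <= reading < high]
    return groups[:max_groups]
```

With criteria `{1: (0, 10), 2: (10, 20)}`, a node reading 12 joins group 2 only, and a node reading 25 joins no group and can stay silent.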
![Page 52: Queries over Streaming Sensor Data Sam Madden DB Lunch October 12, 2001](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649dc55503460f94ab8896/html5/thumbnails/52.jpg)
Local Query Rewrite
• Intermediate nodes may determine that it’s faster to evaluate an aggregate by asking children a different question.
  • Example 1: MAX(t). Once we have a guess T for MAX, ask children to report iff t > T, rather than asking all children to compute a local maximum.
  • Example 2: Network programming. Rather than asking nodes what packets they have, ask them to report iff packets missing.
• Is this a general technique?
  • Maybe: Inform child of guess at aggregate, ask it to refute.
  • Works for average (within error bound), not count.
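Example 1 can be sketched in a few lines. The assumption here is that the guess T is a value the parent has actually observed (for instance the first reply it heard), so that returning T when nobody refutes it is still a correct maximum; the function name is illustrative.

```python
def max_with_suppression(child_values, guess):
    """Evaluate MAX by asking children to reply only if they exceed the guess.

    Returns (maximum, number of replies sent). Children at or below the
    guess stay silent, which is where the message savings come from.
    """
    replies = [value for value in child_values if value > guess]
    return max(replies, default=guess), len(replies)
```

With child readings 12, 30, 18, and 25 and a guess of 25, only the child holding 30 replies, so one message is sent instead of four.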
![Page 53: Queries over Streaming Sensor Data Sam Madden DB Lunch October 12, 2001](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649dc55503460f94ab8896/html5/thumbnails/53.jpg)
Wins and pitfalls of aggregation
• Aggregation over natural network topology
  • Aggregation over an arbitrary subset of the network may be a loss
• Really dense cells
  • Aggregation does not help with the starvation problem
  • Use the message suppression via query rewrite technique
• Still beneficial in a multihop scenario
![Page 54: Queries over Streaming Sensor Data Sam Madden DB Lunch October 12, 2001](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649dc55503460f94ab8896/html5/thumbnails/54.jpg)
Advanced Aggregation Tricks
• Break the network protocol boundary
• Use analog reading from channel over time to determine aggregates.
• Simple example (bits sent over time, least significant bit first):
  • Reading = 11 = 110100
  • Reading = 21 = 101010
  • Sum = 32 = 2 + 2 + 4 + 8 + 16
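The arithmetic in this example can be checked with a short simulation. It assumes that all sensors transmit their readings bit by bit in lockstep, least significant bit first, and that the receiver can measure the signal level in each time slot as the number of simultaneous senders; both are idealizations, and `analog_sum` is a hypothetical name.

```python
def analog_sum(readings, num_bits=6):
    """Sum readings by observing the superposed signal level per bit slot."""
    total = 0
    for slot in range(num_bits):
        # Signal level in this slot = how many sensors are sending a 1 bit.
        level = sum((reading >> slot) & 1 for reading in readings)
        total += level << slot  # weight the level by the slot's place value
    return total


if __name__ == "__main__":
    # Slot levels for 11 and 21 are 2, 1, 1, 1, 1, 0, giving 2 + 2 + 4 + 8 + 16.
    print(analog_sum([11, 21]))
```

The receiver obtains the sum in the time it takes to send one reading, without any sensor transmitting a separate message.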
![Page 55: Queries over Streaming Sensor Data Sam Madden DB Lunch October 12, 2001](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649dc55503460f94ab8896/html5/thumbnails/55.jpg)
Outline
• Background
• Server Side Solutions
  • Fjords, Sensor Proxies, CACQ
• Sensor Side Solutions
  • Catalog Management
  • Aggregation
• Future Work
![Page 56: Queries over Streaming Sensor Data Sam Madden DB Lunch October 12, 2001](https://reader035.vdocuments.site/reader035/viewer/2022062421/56649dc55503460f94ab8896/html5/thumbnails/56.jpg)
Future Work
• DBMS Side
  • Efficient Catalog Management
  • Moving Object Databases
  • Query Optimization Techniques
• Sensor Side
  • Efficient Grouping
  • Joins over Network Topology
  • Non Standard Aggregate Functions
• Somewhere In Between
  • Histograms and other Correlations
  • Sampling and Compression for Streams