Geospatial Stream Query Processing using Microsoft SQL Server StreamInsight

Seyed Jalal Kazemitabar (1), Ugur Demiryurek (1), Mohamed Ali (2), Afsin Akdogan (1), Cyrus Shahabi (1)

(1) Integrated Media Systems Center, University of Southern California
(2) Microsoft SQL Server, Microsoft Corporation

ICampus, IWatch, CT

Introduction

GeoInsight

• A real-world data-driven framework which enables:

– Fast query processing over stream data using Microsoft StreamInsight™

– Running spatial queries over geospatial data

– Online analysis and prediction based on historic data using our in-memory sketching technique

Streaming Engine

• StreamInsight Architecture

• Stream flow in demo

[Figure: stream flow in the demo. Events pass from the Input Adapter through queries Q1 to Q7 (Value Filter, Spatial Filter, PCA, Predict, Refine, Average) to the Output Adapter.]
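The stream flow in the demo chains a value filter, a spatial filter, and an average. The sketch below illustrates that chaining with plain Python generators. It is not StreamInsight code: the `Reading` record, the function names, the speed bounds, and the bounding-box test are all assumptions made for illustration, and the PCA, Predict, and Refine stages are left out.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Tuple


@dataclass(frozen=True)
class Reading:
    """Hypothetical sensor event; field names are assumptions."""
    sensor_id: str
    lat: float
    lon: float
    speed_mph: float


def value_filter(readings: Iterable[Reading],
                 lo: float = 0.0, hi: float = 120.0) -> Iterator[Reading]:
    """Pass only readings whose speed lies in [lo, hi] (assumed bounds)."""
    for r in readings:
        if lo <= r.speed_mph <= hi:
            yield r


def spatial_filter(readings: Iterable[Reading],
                   bbox: Tuple[float, float, float, float]) -> Iterator[Reading]:
    """Pass only readings inside bbox = (min_lat, min_lon, max_lat, max_lon)."""
    min_lat, min_lon, max_lat, max_lon = bbox
    for r in readings:
        if min_lat <= r.lat <= max_lat and min_lon <= r.lon <= max_lon:
            yield r


def running_average(readings: Iterable[Reading]) -> Iterator[float]:
    """Emit the average speed of all readings seen so far, one value per event."""
    total = 0.0
    count = 0
    for r in readings:
        total += r.speed_mph
        count += 1
        yield total / count


def run_pipeline(readings: Iterable[Reading],
                 bbox: Tuple[float, float, float, float]) -> List[float]:
    """Chain the three stages: value filter, spatial filter, average."""
    return list(running_average(spatial_filter(value_filter(readings), bbox)))


if __name__ == "__main__":
    events = [
        Reading("a", 34.05, -118.25, 60.0),
        Reading("b", 34.06, -118.24, -5.0),   # dropped by the value filter
        Reading("c", 40.00, -75.00, 50.0),    # dropped by the spatial filter
        Reading("d", 34.04, -118.26, 40.0),
    ]
    print(run_pipeline(events, (33.9, -118.5, 34.2, -118.0)))
```

Because each stage consumes one event at a time, nothing is stored beyond the running total and count, which is the property a streaming engine relies on.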

Application

Online Analytical Refinement and Prediction (OARP)

Hybrid queries over spatio-temporal windows provide great analysis functionality including:

• Refinement functions

– Smoothing noisy input data according to previously observed patterns

– Detection of anomalies characterized by sensor readings that deviate strongly from historical mean values

• Prediction functions

– Predicting near-future trends based on previously observed patterns

– Responding to anomalies and deliberately attempting to change future conditions

Approach

Using In-memory Sketches

• Instead of storing the whole data in the DB, store the sketches in memory

• Principal Component Analysis (PCA): a mathematical approach for analyzing correlated data

• A number of components with great influence are selected as coordinates

• Improving PCA performance for aggregate queries by calculating the query result in the transformed space
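To make the idea of answering an aggregate query in the transformed space concrete, here is a minimal pure-Python sketch. It is an illustration under assumptions, not the authors' implementation: the sketch layout (mean vector, components, per-row scores), the power-iteration routine, and all function names are hypothetical. Each row x is approximated as mu + sum_j z_j * w_j, so the average over a set of rows can be computed from the averaged scores z_j without reconstructing any row.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def top_components(centered, k, iters=200):
    """Top-k principal directions by power iteration (illustrative, not optimized)."""
    d = len(centered[0])
    # Scatter matrix X^T X; its eigenvectors are the principal directions.
    scatter = [[sum(r[i] * r[j] for r in centered) for j in range(d)]
               for i in range(d)]
    comps = []
    for _ in range(k):
        v = [1.0 / math.sqrt(d)] * d
        for _ in range(iters):
            w = [dot(row, v) for row in scatter]
            # Keep the new direction orthogonal to the ones already found.
            for c in comps:
                proj = dot(w, c)
                w = [wi - proj * ci for wi, ci in zip(w, c)]
            norm = math.sqrt(dot(w, w))
            if norm < 1e-12:
                v = [0.0] * d  # no variance left to explain
                break
            v = [wi / norm for wi in w]
        comps.append(v)
    return comps


def build_sketch(rows, k):
    """Keep only the mean vector, k components, and k scores per row."""
    mu = [mean(col) for col in zip(*rows)]
    centered = [[x - m for x, m in zip(r, mu)] for r in rows]
    comps = top_components(centered, k)
    scores = [[dot(r, c) for c in comps] for r in centered]
    return mu, comps, scores


def average_from_sketch(sketch, row_ids):
    """Average of all values in the selected rows, computed from scores only."""
    mu, comps, scores = sketch
    result = mean(mu)
    for j, c in enumerate(comps):
        result += mean(scores[i][j] for i in row_ids) * mean(c)
    return result


if __name__ == "__main__":
    # Toy data: each row is one day of speeds at three time slots.
    rows = [[50.0, 50.0, 50.0], [51.0, 52.0, 53.0],
            [52.0, 54.0, 56.0], [53.0, 56.0, 59.0]]
    sketch = build_sketch(rows, k=1)
    approx = average_from_sketch(sketch, [2, 3])
    exact = mean(x for i in [2, 3] for x in rows[i])
    print(approx, exact)
```

The toy rows vary along a single direction, so one component reproduces them and the two averages agree up to floating-point error. On real data the agreement depends on how much variance the kept components capture.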

Contribution/Experiments

PCA for Traffic Data

• High data compression rate

– 98% for highway data

• Extra short response time

– 2 milliseconds (compared to 58 sec.)

• Highly accurate for traffic data

– MSE for same query: 10^-4 Mph

[Figure: "Real Data" and "Transformed Data" panels plotting Speed against Time, and a plot of % of Variance in data against Components, annotated 98%.]

Challenges

Large Datasets and Spatial Queries

• Large response time caused by disk I/O limits the availability of hybrid queries in real-time streaming applications

“What was the average speed in I-10 in LA county during summer 2009 from 4:00-5:00 pm?”

Response time for the indexed table containing data of one year (150 GB): 58 seconds!

• Limited support for geostreaming (continuous spatial queries) in current database technologies

Conclusion and Future Work

• Demo application as a proof of concept for a system which runs spatial queries over real-time data

• Implementing the fundamentals of the Clever Transportation (CT) project as a platform for monitoring, querying, and analyzing real-time Los Angeles traffic data

• Devising a scalable spatial alarm continuous query suitable for location-based services
