Spatio-temporal Networks: Analyzing Change Across Time and Place
DESCRIPTION
Organizations are generating powerful insights by analyzing change in spatio-temporal networks in many applications, such as weather risk analysis. The rapid growth in the size, variety and update rate of spatio-temporal data is creating new challenges and opportunities to efficiently store, validate, process and analyze spatio-temporal networks with large time-series data. This slide deck, taken from our webinar on July 16, 2014, reviews the challenges and trends in big data analytics for spatio-temporal networks. Jeremy Peters, Principal Consultant, Pitney Bowes, shows a proof-of-concept application for historical tornado event risk analysis. It is implemented using the Pitney Bowes® Spectrum™ Technology Platform software for configuring and running batch jobs and real-time web services integrated with R density-based spatial clustering and SQL Server Spatial. Pitney Bowes Spectrum™ Spatial for Business Intelligence software provides a web-based map viewer to analyze and visualize tornado event spatial clusters over time, as well as the magnitude of tornado events and the property damage they caused, through surface density mapping.
TRANSCRIPT
MapInfo User Conference 2014: GIS Gets Personal #MapInfoUC
Welcome
Spatiotemporal Networks: Analyzing Change Across Time and Place
Jeremy Peters
Principal Consultant
June 12, 2014
Every connection is a new opportunity™
Agenda
Challenges and opportunities to efficiently store, process and analyze spatio-temporal networks with large time-series data
Trends in Big Data Processing for Real-time Spatial Analytics and Time-Series Analysis
Tornado Risk Analysis application demonstration using Spectrum Spatial
Spatio-Temporal Networks
Capture both the time dependence of the data and the underlying connectivity of the locations
Many applications: commerce, transportation, electricity and gas distribution, telecommunication networks, air/water/land quality monitoring and weather risk analysis among many others
Rapid growth in size, variety, and update rate of spatio-temporal data
Challenges to efficiently integrate, store, validate, process and analyze spatio-temporal networks with large time series data
Challenges & Opportunities: Big Data
Data is growing at about 59% a year globally
Growth is characterized by the following:
• The variety of data types being captured
• The volumes of data being captured
• The velocity or rate at which data is being generated
• The veracity or trustworthiness of the data
Variety of Data Types:
• Semi-structured data, e.g. email, e-forms, HTML, XML
• Unstructured data, e.g. document collections (text), social interactions, images, video and sound
• Sensor and machine generated data
Challenges & Opportunities: Big Data Analytics
Analytical workloads associated with some combination of data volume, data velocity and data variety that may include complex analytics and complex data types
Analytical requirements and data characteristics dictate the technology deployed
Solutions may be implemented on a range of technology platforms including:
• Stream processing engines
• Relational DBMS
• Analytical DBMS (DW appliances)
• Non-relational data management platforms such as a commercialized Hadoop platform
• Specialized NoSQL data store, e.g. a graph database
Challenges & Opportunities: Analyzing Data in Motion For Operational Decisions
Purpose: to analyze events as they happen to detect patterns in the data that impact (or are predicted to impact) costs, revenue, budget, risk, deadlines, customer satisfaction, etc.
Analysis of data may need to take place before this data is stored in a database or a file system
Analysis has to be automated using a variety of analytic methods, such as predictive and statistical models, to determine or predict the business impact of these events, given the velocity at which data is generated and the volumes of data typically involved
Stream processing software is used to support the automatic analysis of data in motion, in real time or near real time, to identify meaningful patterns in one or more data streams and trigger action to respond to them as quickly as possible
Challenges & Opportunities: Analytical Processing using Enterprise Information Management Software
Strengths:
• Ability to define data quality and data integration transforms in graphical workflows
• Analytics, rules, decisions and actions can be added into information management workflows to create automated analytical processes
• Workflows are service enabled, making them available on demand
• Analytical processes exploit the analytical platform best suited to the analytical workload(s) that make up the workflow
Challenges & Opportunities: Location Intelligence
Location Intelligence plays an important role in organizing and using big data
Bringing Big Data Together: Data related to a common location lets you merge or "mash up" data from any source with a spatial reference
Seeing Patterns and Trends: Seeing data and analyzing data in a map view with spatial analysis tools such as heat maps or spatial statistics
Finding Needles in the Haystack: Location and spatial relationships provide a powerful filter for selecting relevant data from the haystack
Challenges & Opportunities: Fusing Space & Time
Many applications, such as map animation; change detection; movement tracking; and spatiotemporal clusters, simulation, and visualization
Integration of temporal data into LI databases and data access structures allows structured queries to be performed on the temporal as well as the spatial attributes of the data
LI software can provide data processing, visualization and analysis tools for both the time and geographic dimensions of data, which help expose important insights and provide actionable information
Historical Tornado Risk Analysis Demo
POC for the analysis of historical tornado event data: visualize tornado event spatial clusters on a map in relation to their magnitude and the property damage caused, for a user-defined study area and a user-defined time period
Web Service and Jobs
• Add required time dimensions (e.g. Month Name) to a table of historical tornado events in RDBMS
• Select historical tornado events from RDBMS that took place within specified dates, within specified months (e.g. tornado season) and within the specified drive distance or radius around a specified address (e.g. study area)
• Calculate the spatial cluster for each tornado event using R density-based DBSCAN spatial clustering
Map Output
• Map selected tornado events over a Google Maps base map, thematically mapped by spatial cluster and/or tornado event attribute, such as property damage
Implementation: Populate a Time Dimension Table
A time dimension table is a table in a database that makes it possible to analyze historic data without using complex SQL calculations
For example, analyze your data by workdays versus holidays, weekdays versus weekends, by fiscal periods or by special events
Used for accurate time-based calculations, such as calculating average sales per day
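As a rough illustration of what a time dimension table contains, here is a minimal Python sketch that builds one row per calendar day. It is not the Spectrum Generate Time Dimension stage. The column names Date, Year, Quarter, MonthNumber, MonthName and WeekdayName follow the fields used in the query later in this deck; IsWeekend and the function name build_time_dimension are assumptions added for illustration.

```python
from datetime import date, timedelta

MONTH_NAMES = ("January", "February", "March", "April", "May", "June", "July",
               "August", "September", "October", "November", "December")
WEEKDAY_NAMES = ("Monday", "Tuesday", "Wednesday", "Thursday", "Friday",
                 "Saturday", "Sunday")


def build_time_dimension(start, end):
    """Return one row (dict) per calendar day from start to end, inclusive."""
    rows = []
    day = start
    while day <= end:
        rows.append({
            "Date": day.isoformat(),
            "Year": day.year,
            "Quarter": (day.month - 1) // 3 + 1,
            "MonthNumber": day.month,
            "MonthName": MONTH_NAMES[day.month - 1],
            "WeekdayName": WEEKDAY_NAMES[day.weekday()],  # weekday(): Monday is 0
            "IsWeekend": day.weekday() >= 5,  # hypothetical extra column
        })
        day += timedelta(days=1)
    return rows


# One week of rows, Monday 25 April 2011 through Sunday 1 May 2011.
rows = build_time_dimension(date(2011, 4, 25), date(2011, 5, 1))
for row in rows:
    print(row["Date"], row["WeekdayName"], row["MonthName"], row["Quarter"])
```

With the attributes precomputed per day, a question such as "events on weekends in April" becomes a plain filter on columns rather than a date calculation in SQL.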
Implementation: Populate a Time Dimension Table
Configure the Generate Time Dimension stage to produce the time dimensions you want
Configure the Write to DB stage to point to the database and table where you want to create the time dimension table
Implementation: Spatio-temporal network table Optimized for Analysis
• Create a new spatio-temporal table optimized for analysis that combines the records in the Time Dimension table and the spatio-temporal network data that will be analyzed (e.g. Tornado events)
– Configure two Read from DB stages to point to the database and to the Time Dimension table and the spatio-temporal network data table that will be analyzed
– Configure the Record Joiner stage to perform a SQL-style Left Outer join operation on the date fields in the two tables
– Configure the Sorter stage to sort the records by date
– Configure the Field Selector stage to select the fields from both tables to include in the output
– Configure the Write to DB stage to point to the database and table where you want to create the new spatio-temporal table
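The join, sort and field-selection steps above can be sketched in plain Python as follows. This is only an illustration of the logic, not the Spectrum dataflow. It assumes the tornado events are the left side of the left outer join (the slide does not say which table is on the left), and the sample records, function names and field subset are made up for the example.

```python
def left_outer_join(events, time_dim, key="Date"):
    """Attach time-dimension fields to each event; unmatched events get None."""
    by_date = {row[key]: row for row in time_dim}
    dim_fields = [f for f in (time_dim[0] if time_dim else {}) if f != key]
    joined = []
    for event in events:
        match = by_date.get(event[key])
        merged = dict(event)
        for field in dim_fields:
            merged[field] = match[field] if match else None
        joined.append(merged)
    return joined


def select_fields(rows, fields):
    """Keep only the named fields, in the given order."""
    return [{f: row[f] for f in fields} for row in rows]


# Illustrative records only, not real tornado data.
events = [
    {"Event_ID": 2, "Date": "2011-04-27", "Magnitude": 4},
    {"Event_ID": 1, "Date": "2011-04-25", "Magnitude": 1},
    {"Event_ID": 3, "Date": "2011-06-01", "Magnitude": 0},  # no matching dimension row
]
time_dim = [
    {"Date": "2011-04-25", "MonthName": "April", "WeekdayName": "Monday"},
    {"Date": "2011-04-27", "MonthName": "April", "WeekdayName": "Wednesday"},
]

joined = left_outer_join(events, time_dim)
joined.sort(key=lambda row: row["Date"])  # ISO dates sort chronologically as text
table = select_fields(joined, ["Date", "MonthName", "Event_ID", "Magnitude"])
for row in table:
    print(row)
```

Because the join is a left outer join, the third event is kept even though the sample time dimension has no row for its date; its MonthName is simply empty.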
Implementation: Create a named table in the Management Console that references the Spatio-temporal network table
Open Management Console.
Expand Modules > Location Intelligence > Tools, then click Named Tables.
Click Add. The Add Named Table dialog box appears.
Complete the Add Named Table dialog.
Implementation: Web Service and Job to analyze Spatio-temporal network data based on time and spatial attributes
Implementation: Web Service to analyze Spatio-temporal network data based on time and spatial attributes
Service Input Stage:
• StartDate & EndDate: Time period of tornado events to analyze (e.g. 1985-01-01 to 2011-01-01)
• Months: Months during the year of tornado events to analyze, such as March, April, May to represent tornado season (e.g. 345)
• AddressLine1, City, StateProvince, PostalCode: U.S. address that represents the starting point of the study area to analyze
• SearchDistance: Distance in miles that will be used to create a drive distance or radius around the specified address
• SearchType: Drive distance or radius around the specified address (D or R)
Implementation: Web Service to analyze Spatio-temporal network data based on time and spatial attributes
Geocode US Address stage:
• Performs U.S. address standardization, address geocoding, and postal code centroid geocoding
Get Travel Boundary stage:
• Creates a drive distance boundary from the geocoded address
Spatial Calculator stage:
• Creates a geometry object which represents a buffered distance around the geocoded address
Write Spatial Data stage:
• Writes the search address point location and drive distance or radius boundary to Spatial database tables
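To make the radius case concrete, the sketch below approximates a circular buffer around a point as a closed ring of vertices on a spherical Earth. It is not the Spatial Calculator stage, and it does not cover the drive distance boundary, which needs road network data. The function names, the 36-segment resolution and the centre coordinates are assumptions chosen for illustration.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; spherical approximation


def destination(lat, lon, bearing_deg, distance_miles):
    """Point reached from (lat, lon) after travelling distance_miles on a bearing."""
    phi1 = math.radians(lat)
    lam1 = math.radians(lon)
    theta = math.radians(bearing_deg)
    delta = distance_miles / EARTH_RADIUS_MILES  # angular distance in radians
    sin_phi2 = (math.sin(phi1) * math.cos(delta)
                + math.cos(phi1) * math.sin(delta) * math.cos(theta))
    phi2 = math.asin(sin_phi2)
    lam2 = lam1 + math.atan2(math.sin(theta) * math.sin(delta) * math.cos(phi1),
                             math.cos(delta) - math.sin(phi1) * sin_phi2)
    return math.degrees(phi2), math.degrees(lam2)


def radius_buffer(lat, lon, radius_miles, segments=36):
    """Closed ring of (lat, lon) vertices approximating a circle around a point."""
    ring = [destination(lat, lon, 360.0 * i / segments, radius_miles)
            for i in range(segments)]
    ring.append(ring[0])  # repeat the first vertex to close the ring
    return ring


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles, used here only to check the ring."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, a)))


# Arbitrary centre point standing in for a geocoded address.
center = (35.0, -97.0)
ring = radius_buffer(center[0], center[1], radius_miles=25.0)
distances = [haversine_miles(center[0], center[1], lat, lon) for lat, lon in ring]
print(len(ring), round(min(distances), 3), round(max(distances), 3))
```

More segments give a smoother boundary at the cost of a larger geometry; a production buffer would normally come from the spatial engine rather than hand-written trigonometry.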
Implementation: Web Service to analyze Spatio-temporal network data based on time and spatial attributes
Query Spatial Data stage:
• Select historical tornado events from RDBMS using MISQL that took place within the specified dates, within the specified months and within the specified drive distance or radius around the specified address:
SELECT Date, MonthName, WeekdayName, Year, Event_ID, Time, TimeZone, Quarter,
  County, State, Magnitude, Fatalities, Injuries, PropertyDamage,
  Longitude As long, Latitude as lat
from "/TornadoRiskAnalysisDemo/USTornadoTD"
WHERE Date >= ${StartDate} and Date <= ${EndDate}
  And MonthNumber IN (
    Substring(${Months}, 1,1), Substring(${Months}, 2,1), Substring(${Months}, 3,1),
    Substring(${Months}, 4,1), Substring(${Months}, 5,1), Substring(${Months}, 6,1),
    Substring(${Months}, 7,1), Substring(${Months}, 8,1), Substring(${Months}, 9,1),
    Substring(${Months}, 10,2), Substring(${Months}, 11,2), Substring(${Months}, 12,2))
  and MI_Point(Longitude, Latitude, 'epsg:4326') within ${GeometryBuffer}
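The three filters in this query (date range, month of year, and location within the study area) can be illustrated with a small Python sketch. It is not MISQL and does not run against Spectrum Spatial. It handles only the radius case, using a great-circle distance test in place of the `within ${GeometryBuffer}` predicate, and it takes the months as a set of integers rather than modelling how the Months string is encoded. The function names, the sample events and the centre point are made up for illustration.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; spherical approximation


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, a)))


def select_events(events, start, end, months, center, radius_miles):
    """Keep events inside the date range, in the given months, within the radius."""
    lat0, lon0 = center
    return [
        e for e in events
        if start <= e["Date"] <= end
        and e["Date"].month in months
        and haversine_miles(lat0, lon0, e["lat"], e["long"]) <= radius_miles
    ]


# Illustrative records only, not real tornado data.
events = [
    {"Event_ID": 1, "Date": date(1999, 5, 3), "lat": 35.1, "long": -97.1},
    {"Event_ID": 2, "Date": date(1999, 7, 10), "lat": 35.1, "long": -97.1},  # wrong month
    {"Event_ID": 3, "Date": date(2000, 4, 15), "lat": 36.0, "long": -97.0},  # too far away
    {"Event_ID": 4, "Date": date(1980, 4, 1), "lat": 35.0, "long": -97.0},   # before StartDate
]

selected = select_events(
    events,
    start=date(1985, 1, 1),
    end=date(2011, 1, 1),
    months={3, 4, 5},       # March, April, May
    center=(35.0, -97.0),   # stands in for the geocoded address
    radius_miles=25.0,
)
print([e["Event_ID"] for e in selected])
```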
Custom R Statistical Computing and Data Analysis stage:
• Calculates the spatial cluster for each tornado event using R density-based DBSCAN spatial clustering
Spatial-Temporal Analysis using R Density-based Spatial Clustering: DBSCAN Custom Stage
R is:
A free software programming language and software environment for statistical computing and graphics
Widely used among statisticians and data miners for developing statistical software and data analysis
Provides a wide variety of statistical and graphical techniques, including linear and nonlinear modeling, classical statistical tests, time-series analysis, classification, clustering, and others
Extensible through functions and extensions
Spatial-Temporal Analysis using R Density-based Spatial Clustering: DBSCAN Spectrum Custom Stage
Density-based spatial clustering of applications with noise (DBSCAN):
Finds a number of clusters starting from the estimated density distribution of corresponding nodes
One of the most common clustering algorithms and also one of the most cited in scientific literature
Definition of a cluster is based on the notion of density reachability
A cluster, which is a subset of the points of the database, satisfies two properties:
– All points within the cluster are mutually density-connected
– If a point is density-reachable from any point of the cluster, it is part of the cluster
Major features:
– Discover clusters of arbitrary shape
– Handle noise
– One scan
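For readers who want to see density reachability in action, here is a minimal plain-Python DBSCAN sketch. It is not the R custom stage used in the demo. It uses planar Euclidean distance, so latitude/longitude data would need to be projected first, and the parameter names eps and min_pts, the label 0 for noise and the sample points are assumptions made for illustration.

```python
import math

NOISE = 0  # label for noise points; clusters are numbered from 1


def region_query(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    px, py = points[i]
    return [j for j, (qx, qy) in enumerate(points)
            if math.hypot(px - qx, py - qy) <= eps]


def dbscan(points, eps, min_pts):
    """Return one cluster label per point (NOISE for points in no cluster)."""
    labels = [None] * len(points)  # None means not yet visited
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1  # i is a core point, so it starts a new cluster
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: density-reachable, not core
            if labels[j] is not None:
                continue  # already assigned; do not expand again
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is core: keep growing the cluster
    return labels


# Two tight groups of four points and one isolated point (made-up coordinates).
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)
```

Each point's neighbourhood is queried once, which is the sense of "one scan"; the brute-force region_query here is quadratic overall, and a spatial index would be needed for large event tables.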
Implementation: Web Service to analyze Spatio-temporal network data based on time and spatial attributes
Service Output Stage:
• Tornado Event ID
• Spatial Cluster
• Latitude
• Longitude
• County
• State
• Magnitude
• Fatalities
• Injuries
• Property Damage
• Date
• Weekday name
• Month name
• Year
• Time
• Time Zone
Implementation: Web Map Viewer to visualize the Spectrum Job tornado event spatial clusters and attribute output
• MI Express: standalone web map, chart and table viewer included with any Spectrum Spatial for BI license
• Demonstrates the mapping capability that you get with any Spectrum Spatial for BI integration with Cognos, Business Objects, QlikView or MicroStrategy
• Used to map and analyze location information in your data, in CSV format with latitude and longitude coordinates attached to each record
• Integrates Named Maps from Spectrum Spatial
• Uses OpenLayers, jQuery, HTML5 and the Spectrum JavaScript API to:
– Map the input address location, the drive distance polygon or radius polygon around the address, and the returned tornado events over a base map
– Thematically map tornado events by Spatial Cluster
– Thematically map tornado events by event attribute, such as property damage
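Since the viewer reads CSV records with latitude and longitude attached, a small sketch of producing such a file may help. This only shows the general shape of the data using Python's standard csv module. The column names, their order and the sample records are assumptions for illustration, not the layout MI Express requires.

```python
import csv
import io

# Hypothetical column names and order for the exported records.
FIELDS = ["Event_ID", "SpatialCluster", "lat", "long", "PropertyDamage"]

# Illustrative records only, not real tornado data.
records = [
    {"Event_ID": 1, "SpatialCluster": 1, "lat": 35.1, "long": -97.1, "PropertyDamage": 250000},
    {"Event_ID": 2, "SpatialCluster": 0, "lat": 36.0, "long": -97.0, "PropertyDamage": 0},
]


def to_csv(rows, fields):
    """Serialise rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = to_csv(records, FIELDS)
print(csv_text)
```

Keeping the cluster label and the damage value as separate columns lets the same file drive either thematic map: one coloured by cluster, the other by event attribute.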
Map Results
Get the latest MapInfo User Conference news on Twitter via #MapInfoUC, and follow us @MapInfo