Automatically Annotating and Integrating Spatial Datasets

Ching-Chien Chen, Snehal Thakkar, Craig Knoblock, Cyrus Shahabi

Department of Computer Science & Information Technologies, University of Southern California

Discussant: Oncel Tuzel




Outline

• Problem Definition

• Finding Control Points

• Filtering Control Points

• Integration of Data Sources

• Performance Evaluation

• Conclusion


Problem Definition

• Automatic integration of data sources having:
  – Different projections
  – Different accuracy
  – Different formats

• Applications
  – Building Finder
  – Road Extraction
  – Etc.


Data Sources

• Microsoft TerraService
  – Satellite image
  – Feature points
    • Feature name
    • Type
    • Latitude/Longitude

• TIGER/Line files (a digital database of geographic features such as roads, railroads, rivers, lakes, legal boundaries, and census statistical boundaries, covering the entire United States)
  – Name
  – Type of feature
  – Latitude/Longitude
  – Address, etc.


Data Sources

• Online data / Yellow pages
  – Type
  – Name
  – Address

[Figure: MS TerraService satellite image; white lines are roads from the TIGER/Line data source]


Finding Control Points

• A control point pair consists of a point in one dataset and the corresponding point in the other dataset.
• Control point pairs determine the accuracy of the algorithm.
• They are used to transform arbitrary points from one dataset to the other.

Methods:
• Using Online Data
• Analyzing Imagery Using Vector Data


Control Points Using Online Data

• Method
  – For a given location, the TerraService dataset has accurate control points (churches, libraries, hospitals, etc.)
  – Find the corresponding control points in the TIGER/Line dataset
  – Search landmark categories on yellow page sources
  – Get the address of each landmark and find that address in the TIGER/Line DB
  – Match the names of the landmarks to find matching control points

• Problems
  – Inaccuracies in yellow pages
  – Landmarks are not uniformly distributed
  – Landmarks may cover large areas
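
The name-matching step above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the record layout `(name, lat, lon)`, the `normalize` helper, the rule of keeping only unambiguous exact matches after normalization, and the sample coordinates are all hypothetical.

```python
import re

def normalize(name):
    """Lowercase a landmark name and collapse punctuation and spaces (assumed rule)."""
    return re.sub(r"[^a-z0-9]+", " ", name.lower()).strip()

def match_control_points(image_landmarks, vector_landmarks):
    """Pair landmarks that share a normalized name.

    image_landmarks:  list of (name, lat, lon) from the imagery-side source
    vector_landmarks: list of (name, lat, lon) geocoded through the vector data
    Returns a list of ((lat, lon), (lat, lon)) control point pairs.
    """
    by_name = {}
    for name, lat, lon in vector_landmarks:
        by_name.setdefault(normalize(name), []).append((lat, lon))

    pairs = []
    for name, lat, lon in image_landmarks:
        candidates = by_name.get(normalize(name), [])
        # Keep only unambiguous matches; names that occur more than once are skipped.
        if len(candidates) == 1:
            pairs.append(((lat, lon), candidates[0]))
    return pairs

# Hypothetical sample records; the coordinates are illustrative only.
image_side = [("St. Mary's Church", 33.9200, -118.4100),
              ("Public Library", 33.9250, -118.4000)]
vector_side = [("St Mary s Church", 33.9203, -118.4096),
               ("City Hospital", 33.9300, -118.4200)]

pairs = match_control_points(image_side, vector_side)
```

Exact matching after normalization is deliberately strict: given the inaccuracies listed under Problems, a real system would likely need fuzzier matching, and any wrong pairs that slip through are what the later filtering step is meant to catch.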


Control Points Using Online Data

[Figure panels: TerraService DB; yellow pages and TIGER/Line DB integrated]


Control Points Analyzing Imagery Using Vector Data

• Road intersections may be good control points
• Use computer vision techniques to find the road intersections on the satellite image
• Find intersections in the TIGER/Line files
• Match control points
• Automatically extracting road intersections from large images is:
  – Time consuming
  – Inaccurate

Proposed Method: Localized Image Processing


Localized Image Processing

• Mark the locations of the intersection points found in the TIGER/Line DB on the satellite image
• Define the area size parameter
  – Start with a small area size and increase it until some clear features are found
• Search the region of the given area size centered at the marked point
• Find the edges in the given region
• Mark the intersection of the detected lines

• Smaller search region
  – easier
  – faster
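
The growing-window search described above might look something like the sketch below. Everything specific here is an assumption made for illustration: the crude intensity-difference test stands in for a real edge detector, and the doubling schedule, `threshold`, and `min_edges` are hypothetical parameters. The final step of intersecting the detected lines is not shown.

```python
def edge_pixels(image, top, left, bottom, right, threshold):
    """Return (row, col) pixels in the window whose horizontal or vertical
    intensity difference exceeds threshold (a crude stand-in edge detector)."""
    rows, cols = len(image), len(image[0])
    edges = []
    for r in range(top, bottom):
        for c in range(left, right):
            dx = abs(image[r][c + 1] - image[r][c]) if c + 1 < cols else 0
            dy = abs(image[r + 1][c] - image[r][c]) if r + 1 < rows else 0
            if max(dx, dy) > threshold:
                edges.append((r, c))
    return edges

def localized_search(image, center, start_half=2, max_half=16,
                     threshold=50, min_edges=8):
    """Grow a square window around center until it holds at least min_edges
    edge pixels. Returns (half_width, edges), or (None, []) if nothing is found."""
    rows, cols = len(image), len(image[0])
    r0, c0 = center
    half = start_half
    while half <= max_half:
        top, bottom = max(0, r0 - half), min(rows, r0 + half + 1)
        left, right = max(0, c0 - half), min(cols, c0 + half + 1)
        edges = edge_pixels(image, top, left, bottom, right, threshold)
        if len(edges) >= min_edges:
            return half, edges
        half *= 2  # assumed growth schedule
    return None, []

# Synthetic 32x32 image: two bright "roads" crossing at (20, 20).
size = 32
image = [[200 if r == 20 or c == 20 else 0 for c in range(size)]
         for r in range(size)]

# The vector data is assumed to predict the intersection at (16, 16).
half, edges = localized_search(image, center=(16, 16))
```

Because only a small neighborhood of each predicted intersection is examined, the cost depends on the window size rather than on the size of the whole image, which is the point of the "easier, faster" bullets.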


Filtering Control Points

• Both methods may generate inaccurate points
• Inaccurate points reduce the accuracy of the alignment of the datasets
• Inaccurate control points are detected by identifying pairs whose relationship differs significantly from that of the other pairs

Vector Median Filter

• Represent each control point pair by a 2D displacement vector
• The median vector is the vector that has the least summed distance to the other vectors
• Finds the correct median if $N/2$ of the pairs are accurate
• Modified to return the $k$ vectors nearest to the median


Vector Median Filter

• As $k$ increases, the filter provides more control points, but more of them may be inaccurate pairs
• A natural choice is to select $k = N/2$
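
A minimal sketch of the modified vector median filter described above, assuming Euclidean distance between displacement vectors and integer division for $k$ when $N$ is odd (both are assumptions, as are the function name and the sample pairs):

```python
import math

def vector_median_filter(pairs, k=None):
    """Keep the k control point pairs whose displacement vectors lie
    closest to the vector median.

    pairs: list of ((x1, y1), (x2, y2)) control point pairs.
    k defaults to N // 2 (rounding down for odd N is an assumption).
    """
    if not pairs:
        return []
    disp = [(x2 - x1, y2 - y1) for (x1, y1), (x2, y2) in pairs]
    n = len(disp)
    if k is None:
        k = max(1, n // 2)

    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])

    # Vector median: the displacement with the least summed distance to all others.
    sums = [sum(dist(d, other) for other in disp) for d in disp]
    median = disp[sums.index(min(sums))]

    # Rank pairs by how far their displacement is from the median; keep the k nearest.
    order = sorted(range(n), key=lambda i: dist(disp[i], median))
    return [pairs[i] for i in sorted(order[:k])]

# Hypothetical pairs: five agree on a shift of about (1, 1); one is an outlier.
pairs = [((0, 0), (1.0, 1.0)), ((5, 5), (6.1, 5.9)), ((2, 3), (3.0, 4.1)),
         ((7, 1), (8.0, 2.0)), ((4, 4), (9.0, -3.0)), ((1, 6), (1.9, 7.0))]
kept = vector_median_filter(pairs)
```

The summed-distance computation is quadratic in the number of pairs, which is usually acceptable for the modest number of control points in one region.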


Conflating Imagery And Vector Data

• Arbitrary points in one dataset are transferred to the other using the extracted control points
• Delaunay triangulation and piecewise linear rubber sheeting are used for the transformation

Triangulation

• Alignment based on local adjustments is proposed
• The domain is partitioned into small pieces (triangles)
• Delaunay triangulation is used
  – Maximizes the minimum angle over all the angles in the triangulation
  – Avoids triangles with small angles
  – Built in O(n log n) time


Conflating Imagery And Vector Data

Piecewise Linear Rubber Sheeting

• Find the transformation coefficients that map the triangulation of the vector data to the imagery
• Apply the same coefficients to the endpoints of the road segments in the vector data
• Construct the road network on the satellite image
• Since Delaunay triangulation avoids triangles with small angles, there is less distortion
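
For a single triangle, the mapping above can be sketched with barycentric coordinates, which is equivalent to the affine transformation fixed by three control point pairs. The sketch assumes the Delaunay triangulation has already been computed elsewhere; the function names and sample triangles are hypothetical.

```python
def barycentric(p, a, b, c):
    """Barycentric weights of point p with respect to triangle (a, b, c)."""
    det = (b[1] - c[1]) * (a[0] - c[0]) + (c[0] - b[0]) * (a[1] - c[1])
    if det == 0:
        raise ValueError("degenerate triangle")
    w_a = ((b[1] - c[1]) * (p[0] - c[0]) + (c[0] - b[0]) * (p[1] - c[1])) / det
    w_b = ((c[1] - a[1]) * (p[0] - c[0]) + (a[0] - c[0]) * (p[1] - c[1])) / det
    return w_a, w_b, 1.0 - w_a - w_b

def contains(p, tri, eps=1e-9):
    """True if p lies inside or on the boundary of triangle tri."""
    return all(w >= -eps for w in barycentric(p, *tri))

def rubber_sheet(p, src_tri, dst_tri):
    """Map p from the source triangle (vector data) to the destination
    triangle (imagery) by reusing its barycentric weights."""
    weights = barycentric(p, *src_tri)
    x = sum(w * v[0] for w, v in zip(weights, dst_tri))
    y = sum(w * v[1] for w, v in zip(weights, dst_tri))
    return x, y

# Hypothetical control point triangle and its counterpart on the image.
src_tri = ((0.0, 0.0), (10.0, 0.0), (0.0, 10.0))
dst_tri = ((1.0, 2.0), (11.0, 2.0), (1.0, 12.0))

road_endpoint = (2.0, 3.0)
if contains(road_endpoint, src_tri):
    moved = rubber_sheet(road_endpoint, src_tri, dst_tri)
```

In a full pipeline each road endpoint would first be located in its enclosing triangle and then mapped with that triangle's weights, so the overall transformation is continuous across triangle edges.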

Region Growing

• Used to find control points where there are no landmarks or intersections
• Extrapolates from the current control points


Performance Evaluation

• Tests are performed by integrating vector data with satellite imagery
• Evaluation considers the method used to generate control points and the effect of filtering
• Hypotheses
  – Automated conflation using automatically generated control points, even without filtering, improves the accuracy of road identification
  – The filtering technique further improves the results
  – The best results are achieved by localized image processing combined with vector median filtering


Experimental Setup

• The Microsoft TerraService web server was used to query satellite images
• TIGER/Line files were used as the vector data
• There are spatial inconsistencies between the datasets
• Accurate roads are generated by conflating the vector data using manually selected control point pairs
• The experiments measure the displacement between the conflated road endpoints and the accurate road endpoints
• Results are given for both control point generation methods, with and without filtering
• Tests are performed on two different locations having 300 and 500 road endpoints
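
The displacement measurement above could be computed along these lines. The slides do not say how displacements were summarized, so the mean and the within-threshold fraction used here are assumptions, as are the function name, the threshold value, and the sample endpoints.

```python
import math

def displacement_stats(conflated, reference, threshold):
    """Mean endpoint displacement and the fraction of endpoints whose
    displacement is at most threshold.

    conflated, reference: equal-length lists of (x, y) road endpoints, where
    reference[i] is the manually derived counterpart of conflated[i].
    """
    if not conflated or len(conflated) != len(reference):
        raise ValueError("endpoint lists must be non-empty and of equal length")
    distances = [math.hypot(cx - rx, cy - ry)
                 for (cx, cy), (rx, ry) in zip(conflated, reference)]
    mean = sum(distances) / len(distances)
    within = sum(1 for d in distances if d <= threshold) / len(distances)
    return mean, within

# Hypothetical endpoints in arbitrary units.
conflated = [(0.0, 0.0), (3.0, 4.0), (10.0, 10.0)]
reference = [(0.0, 0.0), (0.0, 0.0), (10.0, 13.0)]

mean, within = displacement_stats(conflated, reference, threshold=4.0)
```

Running the same function on the output of each control point method, with and without filtering, would give directly comparable numbers for the two test locations.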


Online Data vs. Intersection Points


Filtered Vs. Unfiltered Control Points


Results

[Figure panels: VMF-filtered online data; VMF-filtered intersection points]


Conclusion

• An automated integration approach is designed and implemented
• Results show improvement in road identification

• Does not offer a general mechanism
• Accurate roads may be marked manually on the satellite image?
• Different transformations may be applied to arbitrary points?


The End