Making use of OpenStreetMap data with Python

Page 1: Making use of OpenStreetMap data with Python

Using OpenStreetMap data with Python

Andrii V. Mishkovskyi

June 22, 2011

Andrii V. Mishkovskyi () Using OpenStreetMap data with Python June 22, 2011 1 / 47


Who is this dude anyway?

- I love Python
- I love OpenStreetMap
- I do map rendering at CloudMade using Python
- CloudMade uses OpenStreetMap data extensively


Objectives

- Understand the OpenStreetMap data structure
- Learn how to parse it
- Get a feel for how basic GIS services work


OpenStreetMap

- Founded in 2004 as a response to the Ordnance Survey pricing scheme
- >400k registered users
- >16k active mappers
- Supported by Microsoft, MapQuest (AOL), Yahoo!
- Crowd-sourcing at its best


Why OSM?

- Fairly easy
- Good quality
- Growing community
- Absolutely free


1 Data layout


Storage type

- XML (.osm)
- Protocol buffers (.pbf, in beta status)
- Other formats through third parties (Esri shapefile, Garmin GPX, etc.)


The data

- Each object has geometry, tags and changeset information
- Tags are simply a list of key/value pairs
- Geometry definition differs for different types
- Changeset is not interesting when simply using the data (as opposed to editing)


Data types

- Node: geometric point or point of interest
- Way: collection of points
- Relation: collection of objects of any type


Nodes

    <node id="592637238" lat="47.1675211" lon="9.5089882"
          version="2" changeset="6628391"
          user="phinret" uid="135921"
          timestamp="2010-12-11T19:20:16Z">
      <tag k="amenity" v="bar" />
      <tag k="name" v="Black Pearl" />
    </node>


Ways

    <way id="4781367" version="1" changeset="102260"
         uid="8710" user="murmel"
         timestamp="2007-06-19T06:25:57Z">
      <nd ref="30604007"/>
      <nd ref="30604015"/>
      <nd ref="30604017"/>
      <nd ref="30604019"/>
      <nd ref="30604020"/>
      <tag k="created_by" v="JOSM" />
      <tag k="highway" v="residential" />
      <tag k="name" v="In den Äusseren" />
    </way>


Relations

    <relation id="16239" version="699" changeset="8440520"
              uid="122406" user="hanskuster"
              timestamp="2011-06-14T18:53:49Z">
      <member type="way" ref="75393767" role="outer"/>
      <member type="way" ref="75393837" role="outer"/>
      <member type="way" ref="75393795" role="outer"/>
      ...
      <member type="way" ref="75393788" role="outer"/>
      <tag k="admin_level" v="2" />
      <tag k="boundary" v="administrative" />
      <tag k="currency" v="EUR" />
      <tag k="is_in" v="Europe" />
      <tag k="ISO3166-1" v="AT" />
      <tag k="name" v="Österreich" />
      ...
      <tag k="wikipedia:de" v="Österreich" />
      <tag k="wikipedia:en" v="Austria" />
    </relation>
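To make the three data types concrete, here is a minimal sketch that reads a tiny, hand-written OSM snippet with the standard library's ElementTree and turns each element into a plain record. The record names (`NodeRecord`, `WayRecord`, `RelationRecord`) and the `parse` helper are illustrative assumptions, not part of the talk, and the snippet is trimmed down from the examples above.

```python
import xml.etree.ElementTree as ET
from collections import namedtuple

# Hypothetical record types, one per OSM data type.
NodeRecord = namedtuple("NodeRecord", "id lon lat tags")
WayRecord = namedtuple("WayRecord", "id refs tags")
RelationRecord = namedtuple("RelationRecord", "id members tags")

SNIPPET = """<osm version="0.6">
  <node id="592637238" lat="47.1675211" lon="9.5089882">
    <tag k="amenity" v="bar"/>
    <tag k="name" v="Black Pearl"/>
  </node>
  <way id="4781367">
    <nd ref="30604007"/>
    <nd ref="30604015"/>
    <tag k="highway" v="residential"/>
  </way>
  <relation id="16239">
    <member type="way" ref="75393767" role="outer"/>
    <tag k="boundary" v="administrative"/>
  </relation>
</osm>"""


def tags_of(elem):
    """Collect the <tag k=... v=...> children of an element into a dict."""
    return {tag.get("k"): tag.get("v") for tag in elem.iter("tag")}


def parse(xml_text):
    """Return (nodes, ways, relations) as lists of records."""
    root = ET.fromstring(xml_text)
    nodes = [NodeRecord(int(n.get("id")), float(n.get("lon")),
                        float(n.get("lat")), tags_of(n))
             for n in root.iter("node")]
    ways = [WayRecord(int(w.get("id")),
                      [int(nd.get("ref")) for nd in w.iter("nd")],
                      tags_of(w))
            for w in root.iter("way")]
    relations = [RelationRecord(int(r.get("id")),
                                [(m.get("type"), int(m.get("ref")), m.get("role"))
                                 for m in r.iter("member")],
                                tags_of(r))
                 for r in root.iter("relation")]
    return nodes, ways, relations


nodes, ways, relations = parse(SNIPPET)
print(nodes[0].tags["name"], ways[0].refs, relations[0].members)
```

Note that a way stores only node ids and a relation stores only member ids, so geometry can be built only once the referenced objects are known.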


2 Importing


Major points when parsing OSM

- Expect faulty data
- Parse iteratively
- Cache extensively
- Order of elements is not guaranteed
  - But it’s generally: nodes, ways, relations
- Ids are unique to datatype, not to the whole dataset
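A few of these points can be sketched with the standard library alone. The example below assumes a hypothetical generator `iter_osm` (not from the talk) that parses iteratively with `ElementTree.iterparse`, skips objects with missing or malformed attributes, and keys every object by `(type, id)` because ids are unique only within a data type.

```python
import io
import xml.etree.ElementTree as ET

SAMPLE = b"""<osm>
  <node id="1" lat="47.1" lon="9.5"><tag k="amenity" v="bar"/></node>
  <node id="2" lon="9.6"/>
  <way id="1"><nd ref="1"/><tag k="highway" v="residential"/></way>
</osm>"""


def iter_osm(source):
    """Yield ((kind, id), tags) for every well-formed node, way or relation."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag not in ("node", "way", "relation"):
            continue
        try:
            osm_id = int(elem.attrib["id"])
            if elem.tag == "node":
                # Validate coordinates early: faulty data is expected.
                float(elem.attrib["lat"])
                float(elem.attrib["lon"])
            tags = {t.attrib["k"]: t.attrib["v"] for t in elem.findall("tag")}
        except (KeyError, ValueError):
            elem.clear()  # drop the broken object and move on
            continue
        yield (elem.tag, osm_id), tags
        elem.clear()  # release the children once the object is handled


objects = dict(iter_osm(io.BytesIO(SAMPLE)))
print(objects)
```

Node 1 and way 1 coexist in the result because the key includes the type; node 2 is dropped because it has no latitude.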


Parsing data

- Using SAX
- Doing simple reprojection
- Create geometries using Shapely


Parsing data: Projection

    import pyproj

    # Spherical Mercator: a sphere of radius 6378137 m, coordinates in metres
    projection = pyproj.Proj('+proj=merc +a=6378137 +b=6378137 '
                             '+lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 '
                             '+k=1.0 +units=m +nadgrids=@null +wktext '
                             '+no_defs')
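The Proj string above describes spherical Mercator on a sphere of radius 6378137 m. As a rough illustration of what that projection computes, here is the forward formula written out with the standard library only; `lonlat_to_mercator` is a hypothetical helper name, and this sketch is no replacement for pyproj.

```python
import math

EARTH_RADIUS = 6378137.0  # metres, the +a/+b value in the Proj string


def lonlat_to_mercator(lon, lat):
    """Project WGS84 degrees to spherical Mercator metres."""
    x = EARTH_RADIUS * math.radians(lon)
    y = EARTH_RADIUS * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
    return x, y


# The bar from the node example earlier in the deck
print(lonlat_to_mercator(9.5089882, 47.1675211))
```

The formula diverges towards the poles, which is why web maps usually cut off at roughly 85 degrees of latitude.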


Parsing data: Nodes

    from shapely.geometry import Point

    class Node(object):

        def __init__(self, id, lonlat, tags):
            self.id = id
            # project (lon, lat) into map coordinates before building the point
            self.geometry = Point(projection(*lonlat))
            self.tags = tags


Parsing data: Nodes

    from xml import sax

    class SimpleHandler(sax.handler.ContentHandler):

        def __init__(self):
            sax.handler.ContentHandler.__init__(self)
            self.id = None
            self.geometry = None
            self.tags = None
            self.nodes = {}

        def startElement(self, name, attrs):
            if name == 'node':
                self.id = attrs['id']
                self.tags = {}
                # (lon, lat) order, as expected by the projection
                self.geometry = (float(attrs['lon']), float(attrs['lat']))
            elif name == 'tag' and self.tags is not None:
                self.tags[attrs['k']] = attrs['v']


Parsing data: Nodes

        def endElement(self, name):
            if name == 'node':
                # the node is complete: store it and reset the parser state
                self.nodes[self.id] = Node(self.id,
                                           self.geometry,
                                           self.tags)
                self.id = None
                self.geometry = None
                self.tags = None
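To show how a handler like this is actually driven, here is a self-contained sketch that feeds an in-memory snippet to `xml.sax.parseString`. `NodeCollector` is a simplified stand-in (an assumption for illustration) that stores plain dicts instead of Shapely geometries, so it runs without third-party packages.

```python
import xml.sax
import xml.sax.handler

OSM_SNIPPET = b"""<osm version="0.6">
  <node id="1" lat="47.1675211" lon="9.5089882">
    <tag k="amenity" v="bar"/>
    <tag k="name" v="Black Pearl"/>
  </node>
  <node id="2" lat="47.17" lon="9.51"/>
</osm>"""


class NodeCollector(xml.sax.handler.ContentHandler):
    """Collects nodes as {id: {'lonlat': (lon, lat), 'tags': {...}}}."""

    def __init__(self):
        super().__init__()
        self.nodes = {}
        self._id = None
        self._current = None

    def startElement(self, name, attrs):
        if name == "node":
            self._id = attrs["id"]
            self._current = {
                "lonlat": (float(attrs["lon"]), float(attrs["lat"])),
                "tags": {},
            }
        elif name == "tag" and self._current is not None:
            self._current["tags"][attrs["k"]] = attrs["v"]

    def endElement(self, name):
        if name == "node":
            self.nodes[self._id] = self._current
            self._id = None
            self._current = None


handler = NodeCollector()
xml.sax.parseString(OSM_SNIPPET, handler)
print(handler.nodes["1"]["tags"])
```

For a file on disk, `xml.sax.parse(path, handler)` takes the place of `parseString`, and the handler never holds more than one object in memory at a time.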


Parsing data: Ways

    from shapely.geometry import LineString

    nodes = {...}  # dict of nodes, keyed by their ids

    class Way(object):

        def __init__(self, id, refs, tags):
            self.id = id
            # each entry in nodes is a Node, so coordinates live on its geometry
            self.geometry = LineString(
                [(nodes[ref].geometry.x, nodes[ref].geometry.y)
                 for ref in refs])
            self.tags = tags


Parsing data: Ways

    class SimpleHandler(sax.handler.ContentHandler):

        def __init__(self):
            ...
            self.ways = {}

        def startElement(self, name, attrs):
            if name == 'way':
                self.id = attrs['id']
                self.tags = {}
                self.geometry = []
            elif name == 'nd':
                # a way only references its nodes by id
                self.geometry.append(attrs['ref'])


Parsing data: Ways

        def reset(self):
            self.id = None
            self.geometry = None
            self.tags = None

        def endElement(self, name):
            if name == 'way':
                self.ways[self.id] = Way(self.id,
                                         self.geometry,
                                         self.tags)
                self.reset()


Parsing data: Relations

    from shapely.geometry import MultiPolygon, MultiLineString, ...

    ways = {...}  # dict of ways, with ids as keys

    class Relation(object):

        def __init__(self, id, members, tags):
            self.id = id
            self.tags = tags
            # .get() because not every relation carries a type tag
            if tags.get('type') == 'multipolygon':
                outer = [ways[member['ref']]
                         for member in members
                         if member['role'] == 'outer']
                inner = [ways[member['ref']]
                         for member in members
                         if member['role'] == 'inner']
                # simplified: a real importer first has to join the member
                # ways into closed rings before building the polygons
                self.geometry = MultiPolygon([(outer, inner)])
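In OSM data the boundary of a multipolygon is usually split across many member ways, so the outer and inner ways have to be stitched into closed rings before any polygon can be built. The sketch below shows one simple way to do that on lists of node ids. `assemble_rings` is a hypothetical helper, not code from the talk, and it ignores harder cases such as touching rings or ambiguous junctions.

```python
def assemble_rings(ways):
    """Join ways (lists of node ids) that share end nodes into closed rings.

    Returns a list of rings, each a list of node ids whose first and last
    entries are equal. Raises ValueError if some ring cannot be closed.
    """
    remaining = [list(way) for way in ways if len(way) >= 2]
    rings = []
    while remaining:
        ring = remaining.pop()
        while ring[0] != ring[-1]:
            for index, way in enumerate(remaining):
                if way[0] == ring[-1]:
                    # way continues the ring in its own direction
                    ring.extend(way[1:])
                elif way[-1] == ring[-1]:
                    # way continues the ring when walked backwards
                    ring.extend(reversed(way[:-1]))
                else:
                    continue
                del remaining[index]
                break
            else:
                raise ValueError("ring starting at node %r cannot be closed"
                                 % ring[0])
        rings.append(ring)
    return rings


# Three ways that together form one closed boundary
print(assemble_rings([[1, 2, 3], [3, 4], [1, 5, 4]]))
```

Each resulting ring can then be mapped to coordinates and passed on as a polygon shell (role `outer`) or hole (role `inner`).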


Parsing data: Relations

The importing code is left as an exercise for the reader


For language zealots

Excuse me for not using namedtuples.


Parsing data: homework

- The idea is simple
- The implementation can use ElementTree if you work with small extracts of data
- You have to stick to SAX when parsing huge extracts or the whole planet data


Existing solutions

- Osmosis
- osm2pgsql
- osm2mongo, osm2shp, etc.


3 Rendering


Principles

- Scale
- Projection
- Cartography
- Types of maps


Layers

- Not exactly physical layers
- Layers of graphical representation
- Don’t render text in several layers


How to approach rendering

- Split your data into layers
- Make the projection configurable
- Provide a general way to select data sources
- Think about cartographers


The magic of Mapnik

    import mapnik

    map = mapnik.Map(1000, 1000)
    mapnik.load_map(map, "style.xml")
    bbox = mapnik.Envelope(mapnik.Coord(-180.0, -90.0),
                           mapnik.Coord(180.0, 90.0))
    map.zoom_to_box(bbox)
    mapnik.render_to_file(map, 'map.png', 'png')


Magic?

- Mapnik’s interface is straightforward
- The implementation is not
- Complexity is hidden in XML


Mapnik’s XML

    <Style name="Simple">
      <Rule>
        <PolygonSymbolizer>
          <CssParameter name="fill">#f2eff9</CssParameter>
        </PolygonSymbolizer>
        <LineSymbolizer>
          <CssParameter name="stroke">red</CssParameter>
          <CssParameter name="stroke-width">0.1</CssParameter>
        </LineSymbolizer>
      </Rule>
    </Style>


Mapnik’s XML

    <Layer name="world" srs="+proj=latlong +datum=WGS84">
      <StyleName>My Style</StyleName>
      <Datasource>
        <Parameter name="type">shape</Parameter>
        <Parameter name="file">world_borders</Parameter>
      </Datasource>
    </Layer>


4 Searching


What’s that?

- Codename geocoding
- Similar to magnets
- Fast or correct – choose one


Why is it hard?

- Fuzzy search
- Order matters
  - But not always
- One place can have many names
- One name can correspond to many places
- People don’t care about this at all!


Why is it hard?

I blame Google.


Attempt at implementation

- Impose restrictions
- Make the request structured
- Or at least assume order
- Assume valid input from users


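Attempt at implementation

    # world: geometry covering everything, the initial search boundary
    # data: iterable of (tags, geometry) pairs
    # both are assumed to be defined elsewhere

    def geocode(**query):
        boundary = world
        # narrow the boundary down from the coarsest to the finest key
        for key in ['country', 'zip', 'city',
                    'street', 'housenumber']:
            try:
                value = query[key]
                boundary = find(key, value, boundary)
            except KeyError:
                continue
        return boundary

    def find(key, value, boundary):
        for tags, geometry in data:
            if geometry in boundary and \
                    tags.get(key) == value:
                return geometry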


Fixing user input

- Soundex/Metaphone/DoubleMetaphone
  - Phonetic algorithms
  - Works in 90% of the cases
  - If your language is English
  - Doesn’t work well for place names


Fixing user input

    from itertools import groupby

    def soundex(word):
        table = {'b': 1, 'f': 1, 'p': 1, 'v': 1,
                 'c': 2, 'g': 2, 'j': 2, ...}
        yield word[0]
        codes = (table[char]
                 for char in word[1:]
                 if char in table)
        # groupby yields (key, group) pairs; the key is the collapsed code
        for code, _ in groupby(codes):
            yield code
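For reference, here is a runnable sketch of the classic American Soundex rules with the full letter table. It is a separate, assumed implementation rather than the slide's generator: it additionally treats `h` and `w` as transparent, lets vowels separate repeated codes, and pads the result to four characters.

```python
# Letter groups of the standard American Soundex table.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for letter in letters:
        CODES[letter] = digit


def soundex(word):
    """Return the four-character Soundex code of an English word."""
    word = "".join(ch for ch in word.lower() if ch.isalpha())
    if not word:
        return ""
    result = word[0].upper()
    previous = CODES.get(word[0], "")
    for ch in word[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with equal codes
        code = CODES.get(ch, "")
        if code and code != previous:
            result += code
        previous = code  # vowels reset it, so repeats across vowels count
    return (result + "000")[:4]


for name in ("Robert", "Rupert", "Tymczak"):
    print(name, soundex(name))
```

Names that sound alike in English collapse to the same code, which is exactly why the approach struggles with place names from other languages.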


Fixing user input

- Edit distance
  - Works for two words
  - Most geocoding requests consist of several words
  - Scanning the database for each pair distance isn’t feasible
  - Unless you have it cached already
  - Check out Peter Norvig’s “How to Write a Spelling Corrector” article
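As a reminder of what edit distance computes, here is a minimal Levenshtein sketch using the usual row-by-row dynamic programme. `edit_distance` is an illustrative name, and the function compares just two strings, which is the limitation the list above points out.

```python
def edit_distance(a, b):
    """Number of single-character inserts, deletes and substitutions
    needed to turn string a into string b."""
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # delete char_a
                current[j - 1] + 1,                    # insert char_b
                previous[j - 1] + (char_a != char_b),  # substitute or match
            ))
        previous = current
    return previous[-1]


print(edit_distance("kitten", "sitting"))
```

The cost is proportional to the product of the two lengths, so comparing a query against every name in a database quickly becomes impractical without an index or a cache.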


Fixing user input

- N-grams
  - Substrings of n items from the search string
  - Easier to index than edit distance
  - Gives fewer false positives than phonetic algorithms
  - Trigrams most commonly used


Fixing user input

    from itertools import islice, tee

    def nwise(iterable, count=2):
        # one independent copy of the iterable per position in the window,
        # each copy advanced by its position
        copies = tee(iterable, count)
        return zip(*[islice(copy, start, None)
                     for start, copy in enumerate(copies)])

    def trigrams(string):
        # pad with spaces so word boundaries produce their own trigrams
        string = ''.join([' ', string, ' ']).lower()
        return nwise(string, 3)
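Trigrams become useful for fuzzy matching once two strings are compared by how many trigrams they share. The sketch below uses Jaccard similarity over trigram sets; the choice of measure and the names `trigram_set` and `similarity` are assumptions for illustration, not something the talk prescribes.

```python
def trigram_set(text):
    """Set of 3-character substrings of the space-padded, lowercased text."""
    padded = " " + text.lower() + " "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def similarity(a, b):
    """Jaccard similarity of the trigram sets: 0.0 (disjoint) to 1.0 (equal)."""
    set_a, set_b = trigram_set(a), trigram_set(b)
    union = set_a | set_b
    if not union:
        return 0.0
    return len(set_a & set_b) / len(union)


candidates = ["Vienna", "Berlin", "Vilnius"]
query = "Viena"  # misspelled on purpose
print(max(candidates, key=lambda name: similarity(query, name)))
```

Because every trigram can be stored in an ordinary inverted index, candidates can be looked up without scanning the whole database.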


Making the search free-form

- Normalize input: remove “the”, “a”, ...
- Use an existing free-form search solution
- Combine ranks from different sources


Making the search free-form

    from operator import itemgetter
    from collections import defaultdict

    def freeform(string):
        ranks = defaultdict(float)
        # phonetic, levenshtein and trigrams are search functions that
        # yield (match, rank) pairs for the query; the weights sum to 1
        searchfuncs = [(phonetic, 0.3),
                       (levenshtein, 0.15),
                       (trigrams, 0.55)]
        for searchfunc, coef in searchfuncs:
            for match, rank in searchfunc(string):
                ranks[match] += rank * coef
        # best (match, combined rank) pair
        return max(ranks.items(), key=itemgetter(1))


5 Routing


The problem

When presented with a routing problem, people think: “Build a graph, use Dijkstra, you’re done!” (And they are mostly right.)


The problem

- Not that simple
- Graph is sparse
- Graph has to be updated often
- Dijkstra algorithm is too general
- A* is no better
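For reference, this is the textbook baseline the slide calls too general: Dijkstra's algorithm with a binary heap over an adjacency-list dictionary. It is a minimal sketch with made-up node names and weights, not the routing engine described in the talk.

```python
import heapq


def dijkstra(graph, source, target):
    """Cheapest path in a graph given as {node: [(neighbour, weight), ...]}.

    Returns (cost, path); (inf, []) if target is unreachable.
    """
    queue = [(0.0, source)]
    best = {source: 0.0}
    previous = {}
    while queue:
        cost, node = heapq.heappop(queue)
        if node == target:
            path = [node]
            while node in previous:  # walk back to the source
                node = previous[node]
                path.append(node)
            return cost, path[::-1]
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry, a cheaper route was found already
        for neighbour, weight in graph.get(node, ()):
            new_cost = cost + weight
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                previous[neighbour] = node
                heapq.heappush(queue, (new_cost, neighbour))
    return float("inf"), []


roads = {
    "A": [("B", 1.0), ("C", 5.0)],
    "B": [("C", 2.0)],
    "C": [],
}
print(dijkstra(roads, "A", "C"))
```

Nothing here knows about road classes, turn restrictions or frequent map updates, which is where the real difficulty of routing starts.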


The problem

- Routing is not only a technical problem
- Different people expect different results for the same input
- Routing through cities is always a bad choice (even if it’s projected to be faster)


Building the graph

- Adjacency matrix is not space-efficient
- The graph representation has to be very compact
- networkx and igraph are both pretty good for a start
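An adjacency matrix for N nodes needs N x N cells, so a million junctions would already mean 10^12 entries, nearly all of them empty because each junction connects to only a handful of roads. One common compact alternative is the compressed sparse row (CSR) layout, sketched here with the standard `array` module. `build_csr` and `neighbours` are hypothetical helpers; this is not the in-house library mentioned in the talk.

```python
from array import array


def build_csr(num_nodes, edges):
    """Pack directed edges (source, target, weight) into three flat arrays.

    offsets[n]:offsets[n + 1] is the slice of targets/weights leaving node n.
    """
    offsets = array("I", [0] * (num_nodes + 1))
    targets = array("I")
    weights = array("f")
    for source, target, weight in sorted(edges):  # group edges by source
        offsets[source + 1] += 1
        targets.append(target)
        weights.append(weight)
    for node in range(num_nodes):  # turn per-node counts into start offsets
        offsets[node + 1] += offsets[node]
    return offsets, targets, weights


def neighbours(csr, node):
    """List of (target, weight) pairs for the edges leaving node."""
    offsets, targets, weights = csr
    start, end = offsets[node], offsets[node + 1]
    return list(zip(targets[start:end], weights[start:end]))


csr = build_csr(3, [(0, 1, 1.0), (1, 2, 2.0), (0, 2, 5.0)])
print(neighbours(csr, 0))
```

Storage grows with the number of nodes plus the number of edges instead of the square of the node count, at the price of making updates more expensive.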


Building the graph

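    from networkx import Graph, shortest_path

    ...

    def build_graph(ways):
        graph = Graph()
        for way, tags in ways:
            # nwise yields consecutive coordinate pairs; length and coef
            # (a per-road-type cost factor) are helpers elided on the slide
            for segment in nwise(way.coords):
                weight = length(segment) * coef(tags)
                graph.add_edge(segment[0], segment[1],
                               weight=weight)
        return graph

    # without weight='weight' networkx counts hops and ignores edge weights
    shortest_path(graph, source, dest, weight='weight')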


Building the graph

- There is no silver bullet
- No matter how nice these libs are, importing even Europe will require more than 20 GB of RAM
- Splitting data into country graphs is not enough
- Our in-house C++ graph library requires 20 GB of memory for the whole world


Other solutions

- PgRouting – easier to start with, couldn’t make it fast, harder to configure
- Neo4j – tried 2 years ago, proved to be lacking when presented with huge sparse graphs
- Eat your own dogfood – if doing “serious business”, most probably the best solution. Half-wink.


Bored already?


Lighten up, I’m done


Highlights

- Start using OpenStreetMap data – it’s easy
- Try building something simple – it’s cool
- Try building something cool – it’s simple
- Python is one of the best languages [for doing GIS]


Questions?

mishkovskyi.net/ep2011
