a fast, offline reverse geocoder in python
OUTLINE
OpenSignal
Motivation
The Library
Demo
Performance Results
Applications
Contributions from the Community
OPENSIGNAL
http://opensignal.com | http://wifimapper.com
• Cellular data points: 41 billion
• WiFi data points: 50 billion
• Speed tests: 93 million
MOTIVATION
Reverse geocode terabytes of data (~50M coordinates / day)
Options:
• Online web services (Google Maps, OpenStreetMap)
  • Restrictive
  • Slow
• Offline (PostGIS, Python libraries)
  • Complex
  • Slow
THE LIBRARY
• Improves on an existing library by Richard Penman
• Supports Python 2 and 3
• Geocodes a lot more information
• High Performance
• Open Source (LGPL license)
Statistics: (since 27/03/2015)
• Downloads: 2,649
• Commits: 41
• Committers: 5
• Stars: 1,089
• Forks: 40
#notsohumblebrag
• Place name
• Country Code (ISO-3166)
• Admin region 1
• Admin region 2
• Coordinates
IMPLEMENTATION
Two modes:
• Mode 1: Single-process
• Mode 2 (Default): Multi-process
Source of data: GeoNames
• Places with a population > 1000 (Total = 144,859)
GPS coordinates of cities loaded into a K-D Tree
• Nearest neighbour (NN) algorithm
• Mode 1: cKDTree class in scipy
• Mode 2: Parallelised cKDTree
Dependencies:
• numpy
• scipy
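A minimal sketch of the single-process idea (Mode 1): place coordinates go into a scipy cKDTree, and each lookup is a nearest-neighbour query. The four places and the name nearest_place are hypothetical and only for illustration; the library itself loads its places from GeoNames. Note that the Euclidean metric treats latitude and longitude as planar coordinates.

```python
import numpy as np
from scipy.spatial import cKDTree

# Hypothetical sample of (name, latitude, longitude) records.
# The real data set is the GeoNames list of places with population > 1000.
PLACES = [
    ("London", 51.50853, -0.12574),
    ("Paris", 48.85341, 2.3488),
    ("New York City", 40.71427, -74.00597),
    ("Tokyo", 35.6895, 139.69171),
]

# Build the K-D tree once from the (lat, lon) pairs.
coords = np.array([(lat, lon) for _, lat, lon in PLACES])
tree = cKDTree(coords)


def nearest_place(lat, lon):
    """Return the name of the sample place closest to (lat, lon)."""
    # k=1: single nearest neighbour; p=2: Euclidean distance;
    # no upper bound on the search distance.
    _, idx = tree.query([lat, lon], k=1, p=2, distance_upper_bound=np.inf)
    return PLACES[idx][0]


print(nearest_place(51.52, -0.17))
```

Building the tree is the expensive step, so it is done once and reused for every query.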
PARALLELISED K-D TREE
Uses the multiprocessing module
Pros over threading:
• Exploits multiple CPUs and cores
• No GIL limitation
Cons over threading:
• Separate memory space => IPC or Shared Memory
Static Scheduling
K-D Tree Settings:
• Euclidean distance (Minkowski p-norm where p = 2)
• Distance upper bound: Inf
Refer to the multiprocessing tutorial by Sturla Molden, University of Oslo
APPLICATIONS (1/2)
• Top 20 regions in the UK where OpenSignal users run speed tests
Data from Sep-Dec 2014
APPLICATIONS (2/2)
• Speed test data points from the Greater London region
Data from Sep-Dec 2014
Visualisation using Google Fusion Tables