FOSS4G in the Cloud: Using Open Source to Build Cloud-Based Spatial Infrastructure
![Page 2: FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastructure](https://reader030.vdocuments.site/reader030/viewer/2022032615/55a2821e1a28abd1058b4656/html5/thumbnails/2.jpg)
Agenda
• Disclaimers
• Goals/Motives
• The historical path to ‘Cloud Computing’
• ‘Definition’ of cloud computing
• FOSS4G in Cloud use cases
• AWS: Components and Services
• Building for the cloud
  – Architectural patterns for Cloud Services
  – Cultural changes
  – Process changes
  – Things to remember
• Common FOSS4G tasks in AWS
  – Importing OSM data into PostGIS
  – mod_tile/Mapnik
  – GWC/GeoServer
• Questions?
![Page 3: FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastructure](https://reader030.vdocuments.site/reader030/viewer/2022032615/55a2821e1a28abd1058b4656/html5/thumbnails/3.jpg)
Disclaimers
• The work presented was funded personally and done during my vacation. All opinions are my own and not my employer's.
• I am not affiliated with AWS in any way other than being a customer. I choose them when that choice makes sense and would use others where applicable.
• This is still a work in progress. YMMV.
![Page 4: FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastructure](https://reader030.vdocuments.site/reader030/viewer/2022032615/55a2821e1a28abd1058b4656/html5/thumbnails/4.jpg)
Goals/Motives
• Goals
  – We will learn or validate some ideas.
  – Get some feedback on what to do next.
  – Help save someone time/money/frustration.
  – Raise awareness about some risks.
• Motives
  – The new disruption is in data and the services around it. We (open source people) should not miss out on that, and I believe I can help.
![Page 5: FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastructure](https://reader030.vdocuments.site/reader030/viewer/2022032615/55a2821e1a28abd1058b4656/html5/thumbnails/5.jpg)
Path to Cloud Computing
[Diagram: three streams leading to Cloud Computing]
• Hardware Changes: Multicore Support, NPT/EPT, I/O Offloading
• Virtualization: KVM/Xen, Solaris Zones, VMware/Parallels, Storage/Network Virtualization
• Mobile Computing: Smart Phones, Tablets, MultiScreen
![Page 6: FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastructure](https://reader030.vdocuments.site/reader030/viewer/2022032615/55a2821e1a28abd1058b4656/html5/thumbnails/6.jpg)
Cloud Computing definition (IMHO)
• Cloud computing is a computing paradigm composed of abstractions, a set of primitives, and a set of interfaces and tools to drive those abstractions and primitives. The abstractions and primitives need not be new in themselves, but their combination and impact are what create ‘The Cloud’ culture.
![Page 7: FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastructure](https://reader030.vdocuments.site/reader030/viewer/2022032615/55a2821e1a28abd1058b4656/html5/thumbnails/7.jpg)
Compute Storage Network
Primitives
Abstractions Foundation
Image Volumes Snapshots Autoscale
Tools APIs Config Management
![Page 8: FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastructure](https://reader030.vdocuments.site/reader030/viewer/2022032615/55a2821e1a28abd1058b4656/html5/thumbnails/8.jpg)
Example “High level” Architecture OpenStack
![Page 9: FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastructure](https://reader030.vdocuments.site/reader030/viewer/2022032615/55a2821e1a28abd1058b4656/html5/thumbnails/9.jpg)
In reality, it sorta looks like this
![Page 10: FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastructure](https://reader030.vdocuments.site/reader030/viewer/2022032615/55a2821e1a28abd1058b4656/html5/thumbnails/10.jpg)
AWS as a Public Cloud
![Page 11: FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastructure](https://reader030.vdocuments.site/reader030/viewer/2022032615/55a2821e1a28abd1058b4656/html5/thumbnails/11.jpg)
FOSS4G Use Cases
• Disaster Recovery/Backup
• Static, logic-free web publishing
• Online FOSS4G as a Service
• Data transformation jobs
• Content curation and batch processes
![Page 12: FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastructure](https://reader030.vdocuments.site/reader030/viewer/2022032615/55a2821e1a28abd1058b4656/html5/thumbnails/12.jpg)
Example FOSS4G AWS Use Case: Static publishing blueprint
![Page 13: FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastructure](https://reader030.vdocuments.site/reader030/viewer/2022032615/55a2821e1a28abd1058b4656/html5/thumbnails/13.jpg)
How to Build your Cloud Infrastructure
![Page 14: FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastructure](https://reader030.vdocuments.site/reader030/viewer/2022032615/55a2821e1a28abd1058b4656/html5/thumbnails/14.jpg)
Architectural Patterns
• The Cookie Cutter/Soloist.
• The Centrist.
• The Replicator.
• The Masters of Colonies.
![Page 15: FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastructure](https://reader030.vdocuments.site/reader030/viewer/2022032615/55a2821e1a28abd1058b4656/html5/thumbnails/15.jpg)
CAP: Cookie Cutter
![Page 16: FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastructure](https://reader030.vdocuments.site/reader030/viewer/2022032615/55a2821e1a28abd1058b4656/html5/thumbnails/16.jpg)
The Cookie Cutter/Soloist
• Pros:
  – Simple.
  – Scales horizontally with load.
  – Localized failure impact.
• Cons:
  – Poor support for write-oriented services.
  – Coarse-grained scalability.
  – Node capacity has vertical scalability issues.
![Page 17: FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastructure](https://reader030.vdocuments.site/reader030/viewer/2022032615/55a2821e1a28abd1058b4656/html5/thumbnails/17.jpg)
CAP – The Centrist
![Page 18: FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastructure](https://reader030.vdocuments.site/reader030/viewer/2022032615/55a2821e1a28abd1058b4656/html5/thumbnails/18.jpg)
The Centrist
• Pros:
  – Scales at the component level.
  – Moderate complexity up to middle-range load.
  – Faster/easier fault isolation/detection.
  – Master/slave data stores are a well-studied concept.
• Cons:
  – Central data store becomes more critical and a bottleneck.
  – Multi-region deployments suffer from latency.
  – Vertical scaling characteristics are pronounced on the data store.
![Page 19: FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastructure](https://reader030.vdocuments.site/reader030/viewer/2022032615/55a2821e1a28abd1058b4656/html5/thumbnails/19.jpg)
CAP – The Replicator
![Page 20: FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastructure](https://reader030.vdocuments.site/reader030/viewer/2022032615/55a2821e1a28abd1058b4656/html5/thumbnails/20.jpg)
The Replicator
• Pros:
  – Scales at the component level.
  – Improved read performance.
  – Better disaster recovery.
  – Well suited for multi-region deployments.
• Cons:
  – Writes are still central.
  – Added complexity.
  – Increased bandwidth requirements.
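One possible reading of the Replicator, as a minimal sketch: reads are spread over replicas while every write still goes to a single primary. The class name, node names, and the "first SQL keyword" rule are assumptions made for illustration; the slides do not prescribe an implementation.

```python
from itertools import cycle


class ReplicatedStore:
    """Send writes to one primary and spread reads across replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Fall back to the primary when no replica is configured.
        self._readers = cycle(replicas or [primary])

    def route(self, statement):
        """Return the name of the node that should run the SQL statement."""
        verb = statement.lstrip().split(None, 1)[0].upper()
        if verb == "SELECT":
            return next(self._readers)  # round-robin over replicas
        return self.primary  # writes are still central


# Hypothetical node names, for illustration only.
store = ReplicatedStore("db-primary", ["db-replica-eu", "db-replica-us"])
print(store.route("SELECT 1"))
print(store.route("UPDATE layers SET published = true"))
```

The sketch makes the trade-off on the slide visible: adding replicas only helps the `SELECT` branch, and every replica adds replication traffic.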
![Page 21: FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastructure](https://reader030.vdocuments.site/reader030/viewer/2022032615/55a2821e1a28abd1058b4656/html5/thumbnails/21.jpg)
Masters of Colonies
![Page 22: FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastructure](https://reader030.vdocuments.site/reader030/viewer/2022032615/55a2821e1a28abd1058b4656/html5/thumbnails/22.jpg)
CAP – Masters of Colonies
• Pros:
  – Improved write performance.
  – Decomposes large data sets into smaller ones.
  – Faster data iterations.
  – Good disaster recovery strategy.
• Cons:
  – Complex!
  – Weak/varying support by various data stores.
  – High maintenance overhead.
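A rough sketch of how "decompose large data sets into smaller ones" might look when each master owns one slice of the data. Partitioning by region is an assumption here (it mirrors how regional OSM extracts are cut), and the region and node names are hypothetical.

```python
# Hypothetical mapping of data partitions ("colonies") to their masters.
COLONIES = {
    "europe": "master-eu",
    "north-america": "master-us",
    "asia": "master-ap",
}


def master_for(region, colonies=COLONIES):
    """Return the master that owns writes for the given region."""
    try:
        return colonies[region]
    except KeyError:
        raise ValueError(f"no master owns region {region!r}") from None


def group_by_master(records, colonies=COLONIES):
    """Group (region, payload) records by the master that should store them."""
    batches = {}
    for region, payload in records:
        batches.setdefault(master_for(region, colonies), []).append(payload)
    return batches


print(group_by_master([("europe", "way/1"), ("asia", "node/7"), ("europe", "way/2")]))
```

The explicit mapping is also where the maintenance overhead shows up: every new partition or rebalancing step means changing this table and moving data.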
![Page 23: FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastructure](https://reader030.vdocuments.site/reader030/viewer/2022032615/55a2821e1a28abd1058b4656/html5/thumbnails/23.jpg)
Cultural Changes
• Get stakeholder buy-in early.
• Build a full ownership culture.
• Adopt an agile approach.
• Encourage prototyping and experimentation.
• Automation as a way of life.
![Page 24: FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastructure](https://reader030.vdocuments.site/reader030/viewer/2022032615/55a2821e1a28abd1058b4656/html5/thumbnails/24.jpg)
Process Changes
• Software Architecture:
  – Know the floor, and the ceiling.
  – Be as stateless as possible.
  – Graceful failure response.
  – Good logging as a way of life.
• Release Engineering:
  – The VM as an artifact
  – Automation
  – Versioning
  – Snapshot
• Automation:
  – Configuration management
  – Orchestration
  – Auto-scaling
![Page 25: FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastructure](https://reader030.vdocuments.site/reader030/viewer/2022032615/55a2821e1a28abd1058b4656/html5/thumbnails/25.jpg)
Things to remember
• Review any legal implications.
• Use the cloud primitives.
• Pay attention to security: security groups, encrypted data at rest, etc.
• Clean up old stuff.
• Things fail: don’t fight it, just handle it.
• You will not get it right the first time, but things should look good on the 3rd iteration. (Read The Mythical Man-Month.)
![Page 26: FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastructure](https://reader030.vdocuments.site/reader030/viewer/2022032615/55a2821e1a28abd1058b4656/html5/thumbnails/26.jpg)
FOSS4G in AWS: Performance/Architecture Evaluation
• Tools used:
  – Siege
  – sar
  – OProfile
  – R/AWK/Python/Ruby
• PostgreSQL query log.
• Test client -> target server as separate nodes.
![Page 27: FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastructure](https://reader030.vdocuments.site/reader030/viewer/2022032615/55a2821e1a28abd1058b4656/html5/thumbnails/27.jpg)
OSM Data into AWS
• Setup 1
  – m1.large (2 cores)
  – Standard EBS
  – EU-West region
• Setup 2
  – m1.large
  – Provisioned EBS: 8000 IOPS
  – EU-West region
• Setup 3
  – hi1.4xlarge
  – SSD drive
  – EU-West region
• Setup 4
  – m2.2xlarge
  – Ephemeral drives
  – EU-West region
![Page 28: FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastructure](https://reader030.vdocuments.site/reader030/viewer/2022032615/55a2821e1a28abd1058b4656/html5/thumbnails/28.jpg)
Importing OSM data into AWS: Testing the water
![Page 29: FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastructure](https://reader030.vdocuments.site/reader030/viewer/2022032615/55a2821e1a28abd1058b4656/html5/thumbnails/29.jpg)
Importing OSM data into AWS: Testing the water some more
![Page 30: FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastructure](https://reader030.vdocuments.site/reader030/viewer/2022032615/55a2821e1a28abd1058b4656/html5/thumbnails/30.jpg)
Enough Water Testing: Importing Planet to SSD
• Guess how long it took to finish.
![Page 31: FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastructure](https://reader030.vdocuments.site/reader030/viewer/2022032615/55a2821e1a28abd1058b4656/html5/thumbnails/31.jpg)
Importing Planet into AWS Using SSD
• It only took 35 hours!
• Disk utilization: ~250 GB
• Guess what was the first thing I did when it finished?
![Page 32: FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastructure](https://reader030.vdocuments.site/reader030/viewer/2022032615/55a2821e1a28abd1058b4656/html5/thumbnails/32.jpg)
Importing Planet into AWS
• I made a copy, of course :)
• Create a RAID 0 set.
• Create LVM on top of RAID 0.
• Kick off the data copy.
• Guess how long it took.
![Page 33: FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastructure](https://reader030.vdocuments.site/reader030/viewer/2022032615/55a2821e1a28abd1058b4656/html5/thumbnails/33.jpg)
Importing Planet into AWS
• It only took 2.5 hours.
![Page 34: FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastructure](https://reader030.vdocuments.site/reader030/viewer/2022032615/55a2821e1a28abd1058b4656/html5/thumbnails/34.jpg)
Data Import in AWS OSM full planet
![Page 35: FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastructure](https://reader030.vdocuments.site/reader030/viewer/2022032615/55a2821e1a28abd1058b4656/html5/thumbnails/35.jpg)
Profiling OSM2PGSQL
• Data sets used
• Links/ways/nodes of each set
• Time
![Page 36: FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastructure](https://reader030.vdocuments.site/reader030/viewer/2022032615/55a2821e1a28abd1058b4656/html5/thumbnails/36.jpg)
Data import notes
• Create the DB on SSD and clone to EBS:
  – Use case: quickly import the data but make it persistent.
  – Full planet volume takes 2-2.5 hours.
• Create the DB on Provisioned EBS and clone to SSD:
  – Use case: need very fast runtime access.
  – Full planet volume takes 5.4 hours.
• Can we get an OSM primitives summary per dump and for the full planet as part of the PBF?
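To put the two clone times side by side, here is a small back-of-the-envelope calculation. It assumes the ~250 GB disk utilization reported for the SSD import applies to both copies and that GB means 10^9 bytes; neither is stated explicitly on this slide.

```python
def copy_throughput_mb_s(size_gb, hours):
    """Average throughput in MB/s for copying size_gb gigabytes in `hours`."""
    return size_gb * 1000 / (hours * 3600)


PLANET_GB = 250  # assumed volume size, taken from the SSD import slide

for label, hours in [("SSD to EBS", 2.5), ("Provisioned EBS to SSD", 5.4)]:
    print(f"{label}: {copy_throughput_mb_s(PLANET_GB, hours):.1f} MB/s")
```

Under those assumptions the SSD-to-EBS clone averages roughly 28 MB/s and the Provisioned-EBS-to-SSD clone roughly 13 MB/s.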
![Page 37: FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastructure](https://reader030.vdocuments.site/reader030/viewer/2022032615/55a2821e1a28abd1058b4656/html5/thumbnails/37.jpg)
Data Import in AWS: Lessons learned
• It is not only the disk.
• Risk on multiple levels:
  – Dev teams can’t possibly be testing to their full potential (in the data context).
  – This is evident in outdated/incorrect documentation for bootstrapping.
![Page 38: FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastructure](https://reader030.vdocuments.site/reader030/viewer/2022032615/55a2821e1a28abd1058b4656/html5/thumbnails/38.jpg)
Rendering – mod_tile/Mapnik
• Apache module + a Unix daemon (renderd).
• The Apache module uses a process model; renderd is multithreaded.
• The Apache module sends a command to renderd over a Unix socket.
• The renderer fetches the data and writes it out.
• Non-cached data will:
  – Fail on the first attempt (returns 404)
  – Pass on the second attempt (~600 msec)
• Cached data is served in < 10 msec.
• Very SQL chatty.
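Given the behaviour noted above (404 on the first request for an uncached tile, success on the second), a client can simply ask again. This is a minimal sketch; the function names, the retry count, and the pause are assumptions, and the URL layout of a real deployment is not specified here.

```python
import time
import urllib.error
import urllib.request


def http_get(url):
    """Return (status, body) for a GET request without raising on HTTP errors."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        return err.code, b""


def fetch_tile(url, get=http_get, attempts=2, pause=0.5, sleep=time.sleep):
    """Fetch a tile, retrying when an uncached tile comes back as 404."""
    status = None
    for attempt in range(attempts):
        status, body = get(url)
        if status == 200:
            return body
        if status != 404:
            break  # only the "not rendered yet" case is worth retrying
        if attempt + 1 < attempts:
            sleep(pause)  # give the renderer time to write the tile
    raise RuntimeError(f"tile not available: {url} (last status {status})")
```

`get` and `sleep` are injectable so the retry logic can be exercised without a tile server.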
![Page 39: FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastructure](https://reader030.vdocuments.site/reader030/viewer/2022032615/55a2821e1a28abd1058b4656/html5/thumbnails/39.jpg)
Renderd Threads Profiling
![Page 40: FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastructure](https://reader030.vdocuments.site/reader030/viewer/2022032615/55a2821e1a28abd1058b4656/html5/thumbnails/40.jpg)
Renderd Profiling
![Page 41: FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastructure](https://reader030.vdocuments.site/reader030/viewer/2022032615/55a2821e1a28abd1058b4656/html5/thumbnails/41.jpg)
Renderd Profiling
![Page 42: FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastructure](https://reader030.vdocuments.site/reader030/viewer/2022032615/55a2821e1a28abd1058b4656/html5/thumbnails/42.jpg)
Renderd Profiling
![Page 43: FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastructure](https://reader030.vdocuments.site/reader030/viewer/2022032615/55a2821e1a28abd1058b4656/html5/thumbnails/43.jpg)
Renderd Profiling
![Page 44: FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastructure](https://reader030.vdocuments.site/reader030/viewer/2022032615/55a2821e1a28abd1058b4656/html5/thumbnails/44.jpg)
Rendering – GeoServer/GWC
• Single layer, ZL 15, RAM disk: 100 tiles/sec.
• Truncation is very slow. Please version your published layers.
• Standalone GWC offers a much better scalability model.
• Possible race conditions in threads writing tiles.
• Didn’t hit the getAlphaTile() issue.
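To get a feel for what 100 tiles/sec means, here is a rough estimate of how long a full zoom level would take to seed. It assumes a layer covering the whole world on a square grid with 2^z tiles per side (as in the common XYZ scheme); the slides do not state the layer extent or gridset, so treat the result as an upper-bound illustration.

```python
def tiles_at_zoom(zoom):
    """Tiles in a global square grid with 2**zoom tiles per side."""
    return 4 ** zoom


def seed_days(zoom, tiles_per_second):
    """Days needed to render every tile of one zoom level at a fixed rate."""
    return tiles_at_zoom(zoom) / tiles_per_second / 86400


print(f"{tiles_at_zoom(15):,} tiles, about {seed_days(15, 100):.0f} days at 100 tiles/sec")
```

Under these assumptions zoom level 15 alone is over a billion tiles, which is one reason slow truncation hurts and versioned layers are attractive.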
![Page 45: FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastructure](https://reader030.vdocuments.site/reader030/viewer/2022032615/55a2821e1a28abd1058b4656/html5/thumbnails/45.jpg)
GWC/Geoserver in AWS Example deployment
![Page 46: FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastructure](https://reader030.vdocuments.site/reader030/viewer/2022032615/55a2821e1a28abd1058b4656/html5/thumbnails/46.jpg)
Cost?
• Screenshot of my account activity
![Page 47: FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastructure](https://reader030.vdocuments.site/reader030/viewer/2022032615/55a2821e1a28abd1058b4656/html5/thumbnails/47.jpg)
Released artifacts: Snapshots of OSM data in flat PGSQL
• 2 drives:
  – snap-f9affde6
  – snap-ffaffde0
• To use:
  – Create a volume based on each snapshot.
  – Activate the array with mdadm (RAID 0, 2 drives).
  – Run pvscan, vgscan, vgchange, lvscan.
  – Installing mdadm and rebooting should do this for you automagically on most machines.
  – Mount the volume on your PGDATA path.
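The steps above can be sketched as a command plan. The sketch only builds and prints the commands; it does not run them. The availability zone, logical volume path, and PGDATA path are hypothetical placeholders, the volume attach step is left out because it needs your instance and volume IDs, and the exact `mdadm`/LVM options may differ on your system.

```python
import shlex

SNAPSHOTS = ["snap-f9affde6", "snap-ffaffde0"]  # snapshot IDs from the slide


def restore_plan(zone, lv_path, pgdata):
    """Return the shell commands, in order, to bring the snapshots online."""
    plan = [
        ["aws", "ec2", "create-volume", "--snapshot-id", snap,
         "--availability-zone", zone]
        for snap in SNAPSHOTS
    ]
    # Attach both new volumes to the instance here (IDs are account specific).
    plan += [
        ["mdadm", "--assemble", "--scan"],  # activate the 2-drive RAID 0 set
        ["pvscan"],
        ["vgscan"],
        ["vgchange", "-ay"],                # activate the volume group
        ["lvscan"],
        ["mount", lv_path, pgdata],
    ]
    return plan


# Placeholder values, for illustration only.
for cmd in restore_plan("eu-west-1a", "/dev/vg_osm/lv_osm", "/var/lib/postgresql/data"):
    print(shlex.join(cmd))
```

Keeping the plan as data makes it easy to review before running anything with elevated privileges.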
![Page 48: FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastructure](https://reader030.vdocuments.site/reader030/viewer/2022032615/55a2821e1a28abd1058b4656/html5/thumbnails/48.jpg)
Backlog
• Geocoding testing with Twofishes and Gisgraphy
• OSRM profiling
• Suggestions?
![Page 49: FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastructure](https://reader030.vdocuments.site/reader030/viewer/2022032615/55a2821e1a28abd1058b4656/html5/thumbnails/49.jpg)
Many thanks to
• Geofabrik for compiling all those sets/formats.
• FOSS4G 2013 for this opportunity.
• And THANK YOU.