My latest Europe import was quite eventful. First, I ran out of space several
times during the import itself, at indexing time. The good thing is that, if you
manage to reclaim some space and read a little of the source code[1], you can
replay the missing queries by hand and stop cursing. To be fair, osm2pgsql
currently uses a lot of space in slim+flat-nodes mode: three tables,
planet_osm_nodes, planet_osm_ways and planet_osm_rels; and one file, the flat
nodes one. Those are not deleted until the whole process has finished, but
they're actually not needed after the processing phase. I started working on
fixing that.
But that was not the most difficult part. The most difficult part was that I
somehow forgot to add a column to the import.style file. Elevation, my own
style, renders different icons for different types of castles (and forts too),
just like the Historic Place map or the Hiking and Bridle map[2]. So today I sat
down and tried to figure out how to reparse the OSM extract I used for the
import to add this info.
The first step is to add the column to the tables. But first, which tables should be impacted? Well, the line I should have added to the import style is this:
node,way castle_type text polygon
That says that this applies to nodes and ways. If the element is a way, polygon
will try to convert it to a polygon and put it in the planet_osm_polygon table;
if it's a node, it ends up in the planet_osm_point table. So we just add the
column to those tables:
ALTER TABLE planet_osm_point ADD COLUMN castle_type text;
ALTER TABLE planet_osm_polygon ADD COLUMN castle_type text;
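If you'd rather generate those statements than type them, here is a minimal sketch. The helper name add_column_statements is made up for this example, and it assumes PostgreSQL 9.6 or later, where ADD COLUMN IF NOT EXISTS makes the statement safe to run twice.

```python
def add_column_statements(tables, column, sql_type='text'):
    """Build one ALTER TABLE statement per table (hypothetical helper).

    IF NOT EXISTS needs PostgreSQL 9.6+; it turns a rerun into a no-op
    instead of an error.
    """
    return [
        'ALTER TABLE {} ADD COLUMN IF NOT EXISTS {} {};'.format(table, column, sql_type)
        for table in tables
    ]


# The two tables touched by the 'node,way castle_type text polygon' line.
statements = add_column_statements(
    ['planet_osm_point', 'planet_osm_polygon'], 'castle_type')

for statement in statements:
    print(statement)
```

The output can be pasted into psql or passed one by one to a cursor's execute().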
Now how to process the extract? Enter pyosmium. It's a Python binding for the
osmium library with a stream-like type of processing à la expat for processing
XML. The interface is quite simple: one subclasses osmium.SimpleHandler, defines
the element type handlers (node(), way() and/or relation()) and that's it!
Here's the full code of the simple Python script I wrote:
#! /usr/bin/python3

import osmium
import psycopg2

conn = psycopg2.connect('dbname=gis')
cur = conn.cursor()

class CastleTypes(osmium.SimpleHandler):
    def process(self, thing, table):
        if 'castle_type' in thing.tags:
            try:
                name = thing.tags['name']
                # osmium/boost do not raise a KeyError here!
                # SystemError: <Boost.Python.function object at 0x1329cd0> returned a result with an error set
            except (KeyError, SystemError):
                name = ''

            print(table, thing.id, name)
            cur.execute('''UPDATE '''+table+''' SET castle_type = %s WHERE osm_id = %s''',
                        (thing.tags['castle_type'], thing.id))

    def node(self, n):
        self.process(n, 'planet_osm_point')

    def way(self, w):
        self.process(w, 'planet_osm_polygon')

    relation = way  # handle them the same way (*honk*)

ct = CastleTypes()
ct.apply_file('europe-latest.osm.pbf')

# psycopg2 opens a transaction implicitly; without this the UPDATEs are lost
conn.commit()
cur.close()
conn.close()
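A variation I have not measured: instead of calling cur.execute() for every element, the handler could collect rows and write them afterwards. The sketch below only shows the grouping step. ALLOWED_TABLES, build_batches and flush are names invented for this example, the sample rows are made up, and flush assumes a DB-API cursor with executemany(), which psycopg2 cursors provide. The whitelist is there because the table name is concatenated into the SQL text.

```python
from collections import defaultdict

# Hypothetical whitelist: the only table names allowed into the SQL text.
ALLOWED_TABLES = ('planet_osm_point', 'planet_osm_polygon')


def build_batches(rows):
    """Group (table, osm_id, castle_type) rows into {sql: [params, ...]}."""
    batches = defaultdict(list)
    for table, osm_id, castle_type in rows:
        if table not in ALLOWED_TABLES:
            raise ValueError('unexpected table: ' + table)
        sql = 'UPDATE ' + table + ' SET castle_type = %s WHERE osm_id = %s'
        batches[sql].append((castle_type, osm_id))
    return dict(batches)


def flush(cur, rows):
    """Send one executemany() call per table through a DB-API cursor."""
    for sql, params in build_batches(rows).items():
        cur.executemany(sql, params)


# Made-up sample rows, as a handler could collect them in a list.
rows = [
    ('planet_osm_point', 101, 'defensive'),
    ('planet_osm_polygon', 202, 'palace'),
    ('planet_osm_point', 103, 'fortress'),
]
batches = build_batches(rows)
```

With a real connection this would be flush(cur, rows) followed by conn.commit(). Whether it beats the row-by-row version is something to benchmark, not assume.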
The only strange part of the API is that it doesn't seem to raise a KeyError
when the tag does not exist, but a SystemError. I'll try to figure this out
later. Also interesting is the large number of unnamed elements with this tag
that exist in the DB.
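One way to sidestep the exception altogether is to ask for the tag with a default value. This is a sketch under an assumption: that the tag list pyosmium hands to the handlers offers a dict-like get(key, default), which is worth checking against the installed version. The function element_name is invented for this example, and the elements below are stand-ins built with SimpleNamespace and plain dicts, not real osmium objects.

```python
from types import SimpleNamespace


def element_name(thing):
    """Return the element's 'name' tag, or '' when it has none.

    Assumes thing.tags has a dict-like get(key, default).
    """
    return thing.tags.get('name', '')


# Stand-ins for OSM elements; real ones come from the osmium handlers.
unnamed = SimpleNamespace(id=1, tags={'castle_type': 'defensive'})
named = SimpleNamespace(id=2, tags={'castle_type': 'fortress',
                                    'name': 'Example Castle'})

for thing in (unnamed, named):
    print(thing.id, repr(element_name(thing)))
```

If that assumption holds, the try/except in process() collapses into a single line.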
[1] I would love for GitHub to recognize something like https://github.com/openstreetmap/osm2pgsql/blob/master/table.cpp#table_t::stop and be directed to that method, because #Lxxx gets old pretty quick.
[2] I just noticed how much more complete those maps are. More ideas to use :)