OpenStreetMap

Distributed Map rendering

Posted by Deelkar on 24 June 2008 in English.

After my rather pessimistic post last time, I want to say that I'm basically optimistic about the progress of OSM rendering efforts in general and tiles@home in particular.

There are, however, fundamental differences between centralised map rendering and distributed map rendering. Some are obvious, others less so.

Let me explain:
The obvious difference is, well, that one is centralised and runs on one or maybe a couple of central servers that can, hopefully, render everything "on demand". Mapnik can do this. The main advantage is that this is highly efficient, as only "interesting" parts of the world are rendered, to exactly the level of detail needed. The process is also fast and scalable enough to suit our current needs. However, this comes at the cost of a specialised dataset that cannot be updated by diffs, so while the throughput and rendering speed are very good, the latency is very bad (currently up to one week). Many consider this a major flaw, which is why projects like tiles@home were started.
The advantage of the distributed method is that it has (theoretically) a very low latency, and even in practice the "osmarender" layer generated by this project is generally up to date within hours of the corresponding edits.
The downside is that the tiles have to be generated from live data, and it doesn't make much sense to request data from the API for every little tile. So we work in tilesets: the area of one z12 tile is downloaded from the API, and then all tiles in that area from zoom 12 up to zoom 17 are generated, regardless of whether anyone will ever look at them.
This is efficient in a "save API resources" way, but not for the central server that has to manage the tiles in the end.
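To put a number on the tileset idea, here is a rough sketch in Python. It is not the actual tiles@home client code; the function names are made up for illustration, and it assumes the usual slippy-map tile numbering, where each tile splits into four at the next zoom level. It works out the area that one z12 tile covers (the part that would be requested from the API) and lists every tile from z12 to z17 that gets rendered from that single download.

```python
import math

TILESET_ZOOM = 12  # zoom of the tile whose area is downloaded from the API
MAX_ZOOM = 17      # deepest zoom level rendered for that area


def tile_to_lonlat(x, y, zoom):
    """Return (lon, lat) in degrees of the north-west corner of a tile.

    Assumes standard slippy-map (spherical Mercator) tile numbering.
    """
    n = 2 ** zoom
    lon = x / n * 360.0 - 180.0
    lat = math.degrees(math.atan(math.sinh(math.pi * (1 - 2 * y / n))))
    return lon, lat


def tileset_bbox(x, y, zoom=TILESET_ZOOM):
    """Return (west, south, east, north) of the area covered by one tile."""
    west, north = tile_to_lonlat(x, y, zoom)
    east, south = tile_to_lonlat(x + 1, y + 1, zoom)
    return west, south, east, north


def tiles_in_tileset(x, y, base_zoom=TILESET_ZOOM, max_zoom=MAX_ZOOM):
    """Yield (zoom, x, y) for every tile rendered from one tileset."""
    for zoom in range(base_zoom, max_zoom + 1):
        scale = 2 ** (zoom - base_zoom)  # tiles per side at this zoom
        for tx in range(x * scale, (x + 1) * scale):
            for ty in range(y * scale, (y + 1) * scale):
                yield zoom, tx, ty


if __name__ == "__main__":
    # Arbitrary example tile; any valid z12 coordinates work.
    print(tileset_bbox(2048, 2048))
    print(sum(1 for _ in tiles_in_tileset(2048, 2048)))
```

One download therefore produces 1 + 4 + 16 + 64 + 256 + 1024 = 1365 tiles, of which 1024 are at z17 alone. That is the trade-off: a single API request per tileset, but well over a thousand tiles for the central server to take in and store each time, whether or not anyone ever looks at them.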
Of course, as with centralised rendering, you can try to split the load between multiple servers, which would relieve things somewhat, but then there will be another bottleneck, namely the API providing the data. There would have to be another data server besides the API, maybe kept up to date with the minutely diffs or some kind of replication mechanism, just to serve the bulk requests from renderers.

So basically the two methods of rendering complement each other, and while it would be possible to remedy the shortcomings of either approach, it's not easily done. And even if it were, having two different renderers in existence gives a glimpse of what is possible with OSM data.

Discussion

Comment from Andy Allan on 25 June 2008 at 10:55

I don't think the osm2pgsql diff parsing will be that difficult, it just needs someone focussed on doing so. The base work (slim mode) has already been implemented - because osm2pgsql is lossy it needs a reference dataset to calculate the results of a diff.

At that point, practically every client running t@h could run their own mapnik-powered map all by themselves, and your concerns about not being able to make a map showing everything would become moot.
