OpenStreetMap

chriscf's Diary

Recent diary entries

Tainted imports

Posted by chriscf on 30 September 2011 in English.

We're approaching the third anniversary of my pointing out on the wiki (and IRC, IIRC) that the GADM boundaries were apparently encumbered such that we couldn't use them. Like this. Which sort of begs the question of why it's still in the database, or why people have indeed continued adding them (in Ecuador and across the Middle East).

So, what are we going to do about them?

South Sudan

Posted by chriscf on 23 September 2011 in English.

Much is being made of Google's addition of the South Sudan name and border to its maps. What seems to be missing from much of the coverage is that we got there on 9 July. That said, we are missing a point of dispute, in that the area around Abyei is claimed by both states - an internet to anyone who has the requisite data to mark said area on the map.

Coercion, again

Posted by chriscf on 16 June 2011 in English.

For a while, many of the objectors to the licence change have been making nonsense claims that the process has been sudden (though it's been 4 years and counting), secretive (though the mailing lists are open and archived), and undemocratic (though if you don't register to vote, you have no standing to complain). However, one has finally been confirmed.

Some have been claiming that coercion has been in play. I didn't believe it at first, but I have recently discovered shocking evidence that it has been happening. This is a case of someone attempting to coerce the community, and threatening to hold your data hostage until they get their way - and they're attempting to do it through back-room dealings at OSMF! This is disgusting, and it should be utterly rejected.

This is an interesting trace. Not least because it would involve walking or cycling directly through solid ground, and then through two walls.

Of course, MapDust has reports that are useless in other ways, such as this. Someone has failed to appreciate that without knowing the intended route, I have no idea whatsoever why this person was instructed to exit the motorway, or why they felt it was inappropriate.

That said, reports such as this should just be rejected automagically.

Per-changeset relicensing question

Posted by chriscf on 24 February 2011 in English.

There is a question doing the rounds regarding per-changeset relicensing, which asks (specifically) whether the ability to do this would allow people to agree to the new terms. This qualifier is more-or-less irrelevant. It would be a useful feature, but it is IMO nonsense to suggest that this would allow contributors to agree, mostly because it carries the inherent assumption that this is preventing them from doing so in the first place - not helped by the significant FUD being spread on this matter.

My comments to the survey:

This question is only relevant if having contributed incompatible data somehow debarred someone from agreeing to CT. It doesn't - any tainted data that couldn't be salvaged through negotiation with the supposed rights holder would have to be removed regardless of the wishes of the editor who contributed it.

This is by reference to clause 1(b) of CT 1.2.4:

(b) Please note that OSMF does not have to include Contents You contribute in the Project, and may remove Your contributions from the Project at any time. For example, if we suspect that any contributed data is incompatible, (in the sense that we could not continue to lawfully distribute it), with whichever licence or licences we are then using (see sections 3 and 4), then we may delete that data.

My reading of the spirit of the new clause 1 is that you agree to contribute "clean" (i.e. compatible) data, and that 1(b) would allow a shade for "tainted" (i.e. incompatible, or thence derived) data contributed in the past. The appropriate remedies would be either to terminate the agreement, or to sue - and to do either to a contributor who is willing to contribute original data is counterproductive.

There are experts who seem to know how to handle data which needs to go. Let them deal with it. Everyone else needs to stop worrying and get on with their lives.

Let me repeat again: having contributed tainted data in the past does not prevent you from agreeing and contributing clean data in the meantime. Anyone that tries to tell you that a little tracing or a small import means you will be effectively banned on April 1st should be struck firmly with a clue-by-four. If in doubt, the route of "agree for future contributions" can be achieved for the time being by creating a new account. Earlier readers will remember this handy pull quote:

"Anyone that tells you that you can't create a new account to agree to CT is an idiot." -- chriscf

For anyone waiting for the amended CT, LWG minutes suggest that they are ready to go, but waiting to be pushed to live.

The moral of this story? This exercise is not as big a deal as some people seem to want it to be.

Crustum in caeli: buses

Posted by chriscf on 10 January 2011 in English.

[Mostly inspired to say this in public after Harry's recent entry, and some of the comments that followed.]

The poor Latin aside, one of my side interests is public transport. Having ridden many of the area's bus routes, and induced paranoia in drivers on others (mostly through tailing them in the car), I have built up a sizable amount of route data for this part of the world I call home. Indeed, a look around the last edition of ÖPNV-Karte, and a good deal of that red is my fault. Therefore it seems quite a shame to have seen said ÖPNV-Karte fall into the trouble it has - particularly given it was a very useful tool for mappers to check their data against once the updates had gone through.

I have a small render (based on this) set up on a VM at home to simply draw red lines on transparent tiles for pretty much the area you see there, and little else. (I don't have the space or the processing power to do much more - it can do this basic download-filter-render cycle in around 90 minutes).

There are some hurdles I'd need to overcome to push this further, and some things I'd like to explore once over the hurdles.

Hurdle 1: resource. This very plainly isn't going to scale to anywhere beyond the slightly-less-than-a-square-degree I currently have on my box at home. I can't get past transparency because I suspect the VM won't handle the coastline, etc. (which I dropped from the local render - no great loss if they're transparent). Such a thing is also of no use to anyone if it's buried at home without a permanent connection.

Hurdle 2: stylesheets. Mapnik stylesheets appear to be a pain in the bum to maintain. Indeed, a look at the 270kB monstrosity that powers the main Slippy Map would leave me surprised if the policy was anything other than "If it's working, leave it the fuck alone". Generating the simple plain red semi-transparent lines (which look kind of awesome when they've built up with lots of routes, but could be better) was nice and simple with the use of Spreadnik; anything more complex is going to be awkward, though not necessarily difficult.

With the hurdles dealt with:

1. Get lines and numbers on the map. Maybe use shields in certain views. For local networks, maybe use dynamic colouring - the base colour of a line (or better still, the background colour for the shield) comes from the color attribute on the relation.

2. Go well beyond transparency. Produce maps at tighter zooms that look like some city plans - grey background, white roads, colour fill for streets used by buses. Turn a Mapnik feature to my advantage by maybe using different styles for z11-14 and z15-18 (Or Something[TM]).

Anyone up for some aerial baking?

Names and numbers

Posted by chriscf on 9 January 2011 in English.

I'd be interested in finding out who was responsible for assigning names and numbers to properties in Swansea Marina, and having them prosecuted for crimes against humanity. I am sure in times to come archaeologists will uncover the bodies of people who lost their lives trying to figure out where a particular flat on Arethusa Quay is supposed to be.

Take a look here and see if you can guess (without actually spending 45 minutes on the scene figuring it all out) where in that square in the middle (north of Trawler Road) the following addresses are supposed to be:

1-55 Trawler Road
1-71 Abernethy Quay
1-21 Abernethy Square
1-58 St. Nicholas Square

Answers on a postcard please. I will reveal the answers just as soon as I can find the patience to punch in the details.

Drawing geodesics vs. loxodromes

Posted by chriscf on 6 September 2010 in English.

When someone specifies a "line" between two points on the surface of the earth, they will usually be referring to one of two types:

* loxodromic - constant bearing, and shown as a straight line in Mercator

* geodesic - conceptually closer to the "straight line"

Drawing a way representing a loxodrome is easy - plot two points, draw a way between them, project it in Mercator, and add intermediate points on the resulting straight line if necessary (i.e. if ways are not already assumed to be loxodromes).
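The loxodrome recipe above can be sketched in a few lines of Python. This is only a minimal sketch, assuming a spherical Mercator projection and a line that doesn't cross the antimeridian; the function names are made up for illustration and don't come from any OSM tool.

```python
import math

def mercator_y(lat_deg):
    """Spherical Mercator northing on a unit sphere for a latitude in degrees."""
    lat = math.radians(lat_deg)
    return math.log(math.tan(math.pi / 4 + lat / 2))

def inverse_mercator_y(y):
    """Latitude in degrees for a unit-sphere Mercator northing."""
    return math.degrees(2 * math.atan(math.exp(y)) - math.pi / 2)

def loxodrome_points(lat1, lon1, lat2, lon2, segments=8):
    """Intermediate (lat, lon) points along the constant-bearing line.

    Assumption: the line does not cross the antimeridian, so longitude
    can be interpolated linearly.
    """
    y1, y2 = mercator_y(lat1), mercator_y(lat2)
    points = []
    for i in range(segments + 1):
        t = i / segments
        # A straight line in Mercator: interpolate x (longitude) and y linearly.
        lon = lon1 + t * (lon2 - lon1)
        lat = inverse_mercator_y(y1 + t * (y2 - y1))
        points.append((lat, lon))
    return points
```

Note that the intermediate latitudes are not simply the linear average of the endpoints, because the Mercator northing stretches towards the poles.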

Here comes the problem - I need to draw a line which is defined as a series of geodesics between points, rather than loxodromes. How on earth do I do it? (No pun intended)
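One possible approach, for what it's worth, is to treat the earth as a sphere and interpolate along the great circle between each pair of points, adding enough intermediate nodes that the straight segments between them are close enough. The sketch below assumes a perfect sphere (a true geodesic on the ellipsoid differs slightly), and the function name is hypothetical.

```python
import math

def great_circle_points(lat1, lon1, lat2, lon2, segments=8):
    """Intermediate (lat, lon) points, in degrees, along a great circle.

    Assumption: the earth is a sphere. Antipodal endpoints are rejected
    because the great circle between them is not unique.
    """
    p1, l1 = math.radians(lat1), math.radians(lon1)
    p2, l2 = math.radians(lat2), math.radians(lon2)

    # Angular distance between the endpoints (haversine formula).
    a = (math.sin((p2 - p1) / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin((l2 - l1) / 2) ** 2)
    d = 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))

    if d == 0:
        return [(lat1, lon1)] * (segments + 1)
    if abs(math.sin(d)) < 1e-12:
        raise ValueError("antipodal points: great circle is not unique")

    points = []
    for i in range(segments + 1):
        f = i / segments
        # Spherical linear interpolation between the two unit vectors.
        wa = math.sin((1 - f) * d) / math.sin(d)
        wb = math.sin(f * d) / math.sin(d)
        x = wa * math.cos(p1) * math.cos(l1) + wb * math.cos(p2) * math.cos(l2)
        y = wa * math.cos(p1) * math.sin(l1) + wb * math.cos(p2) * math.sin(l2)
        z = wa * math.sin(p1) + wb * math.sin(p2)
        lat = math.degrees(math.atan2(z, math.hypot(x, y)))
        lon = math.degrees(math.atan2(y, x))
        points.append((lat, lon))
    return points
```

A line defined as a series of geodesics would then just be this applied to each consecutive pair of defining points, dropping the duplicated node where the pieces join.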

OS OpenData suspected CT-compliant

Posted by chriscf on 28 August 2010 in English.

First, the important part: Having contributed data derived from OS OpenData, I intend on agreeing to CT and relicensing my previous contributions before the end of September.

If anyone is in any doubt over the situation with OS and CT, create a new account and avoid deriving from OS data in the meantime. This will allow you to continue contributing to OSM until the matter is settled. Anyone that tells you otherwise is an idiot, and you can quote me on that. In fact, here's a pre-formatted pull-quote you can use:

"Anyone that tells you that you can't create a new account to agree to CT is an idiot." - chriscf

Here comes the complicated bit that you have to read from start to finish. If you're not interested, you can probably stop here. Otherwise, you will need to read all the way to the end.


There are open questions as to whether third-party sources are good for the new Contributor Terms. We've already had NearMap come forward and say they're not happy with them. This means we will likely lose that data. This is probably the fault of whatever idiot thought it was remotely a good idea to add third-party CC-BY-SA data while the licence change discussion was well under way (this has been in train since around 2007 or so, if not earlier).

Let us turn to clause 2: "You hereby grant to OSMF a worldwide, royalty-free, non-exclusive, perpetual, irrevocable licence to do any act that is restricted by copyright over anything within the Contents, whether in the original medium or any other." According to section 16 of the Copyright, Designs and Patents Act 1988, the "acts restricted by copyright" are:

(a) to copy the work (see section 17);
(b) to issue copies of the work to the public (see section 18);
(ba) to rent or lend the work to the public (see section 18A);
(c) to perform, show or play the work in public (see section 19);
(d) to communicate the work to the public (see section 20);
(e) to make an adaptation of the work or do any of the above in relation to an adaptation (see section 21);

The OS OpenData licence specifically makes the following statement:

This is a worldwide, royalty-free, perpetual, non-exclusive licence from the provider of the Data (the “Data Provider”) to use it subject to the conditions below.

This only leaves the word "irrevocable", which is dealt with by this paragraph at the end:

The Data Provider may amend the terms of this licence or make the Data available under a different licence. However, these terms will continue to apply to data you already license from the Data Provider.

Then there's the lovely bit about "any act that is restricted by copyright". We can interpret this in terms of section 16. The OS OpenData licence comes with the following grants:

You are free to:
* copy, distribute and transmit the Data;
* adapt the Data;
* exploit the Data commercially whether by sub-licensing it, combining it with other data, or by including it in your own product or application.

The first point covers 16(a), 16(b), 16(c) and 16(d); the second point covers 16(e); the third point joins the dots to cover 16(ba). Hence, the licence covers "any act that is restricted by copyright".

As there is no share-alike provision which imposes upon your own licensing options, it would appear that we are granted a licence of the type in clause 2, and nothing in it restricts our users' ability to grant such a licence themselves. As for clause 1, this licence has been authored by the OS, exists in fixed form and is prominently announced from the OS OpenData site - therefore there can be no doubt that it is "explicit permission". Any person that thinks otherwise should be very careful never to find themselves in the presence of two doctors.

There is a requirement for attribution, and any object derived from the various OS OpenData products should already be tagged as such. If anyone has been importing stuff from OS OpenData and not tagging the source, they should be taken out and shot - this particular requirement has been right there on the mailing lists and the wiki since day one (since around half-past three on day one, in fact).

Some may say "surely that licence includes removing the attribution". It doesn't. That is not an "act that is restricted by copyright". That is a matter of the contractual relationship between the parties. It may be true that a person receiving the data may end up not attributing the OS. That is their problem, not ours. It has been suggested that it is sufficient for someone to point back to OSM, using our copyright info page to properly attribute the OS over the whole dataset (in addition to the individual items within it). We cannot write anything into the terms of the contract that tells people that they must adhere to the licence, or the consequences for failure, as there is a long history of case law (stretching back to 1886) which maintains that a contract cannot contain deterrent terms.

Readers should also note that I have assumed that the data provided by the OS is covered by copyright in the first place - there seems to be an increasing body of legal opinion in common law jurisdictions that the vector data is not, and neither is the vector data derived from raster data.

Based on that, my own interpretation of the scenario as a pragmatic layman with some experience of the legal system (note: not a lawyer) is that data derived from the OS OpenData release is on balance of probabilities acceptable under CT.

I therefore intend on agreeing to CT and relicensing my previous contributions before the end of September.

If anyone is in any doubt, create a new account and avoid deriving from OS data in the meantime. This will allow you to continue contributing to OSM until the matter is settled.

Copyright on marine boundaries

Posted by chriscf on 16 August 2010 in English.

The boundaries we have for the UK at sea are horrendously inaccurate. There are an awful lot of lines that are seemingly put in there to give closed polygons (IMO on par with tagging for the renderer - distorting the data for the convenience of some tool). Believe it or not, it's perfectly acceptable to leave them open if we don't know where one side actually is.

Take a look in the Bristol Channel, and you'll find a border - that part is in reality territorial waters. Worse still, look between the Scottish Highlands and the Outer Hebrides and you'll find a border running through what are actually internal waters, since the baseline is drawn around it.

The actual demarcation points for the baseline, including the boundaries between internal zones attributable to the four nations, are defined in legislation, where they are simply listed as lat/lon co-ordinates. I'm not sure whether these are necessarily off-limits - I can find nothing which makes French legislation public domain (as is the case in e.g. the USA) and yet we have these: http://www.openstreetmap.org/browse/node/479324518

Would anyone think that a list of lat/lon pairs listed in an Order-in-Council is subject to copyright, or would they amount to "bare fact"?

(Of course, the above is nothing next to the heinous crime of drawing the Irish boundaries *through* Lough Foyle and Carlingford Lough.)