OpenStreetMap

mmd's Diary Comments

Diary Comments added by mmd

Road Watcher - a simple Python bot to monitor road class changes

You could probably achieve something similar by using https://gist.github.com/mmd-osm/bca4da8cd9afe8a071c45fef38e9afa6

The query only returns ways which have changed since a cutoff timestamp (newer:…), and then returns the same ways again as of the cutoff timestamp itself.

As a result, you don’t need to store the previous query result anymore but only the last processing timestamp.

Also, the query result should hopefully be much smaller, like a few hundred ways instead of more than 200k.

There’s one caveat: the way the retro statement is used here might change in the future (see https://wiki.openstreetmap.org/wiki/Overpass_API/Overpass_QL#The_block_statement_retro )
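
The bookkeeping described above could be sketched roughly like this in Python. The state file name, the highway filter and the helper names are assumptions made for illustration. The retro part of the linked gist is deliberately left out, so only the newer: filter is shown.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

STATE_FILE = Path("road_watcher_state.json")  # hypothetical file name


def load_last_run(path=STATE_FILE, default="2024-01-01T00:00:00Z"):
    """Return the timestamp of the previous run, or a default on the first run."""
    if path.exists():
        return json.loads(path.read_text())["last_run"]
    return default


def save_last_run(timestamp, path=STATE_FILE):
    """Persist only the processing timestamp, not the previous query result."""
    path.write_text(json.dumps({"last_run": timestamp}))


def build_query(since):
    """Overpass QL that only returns highway ways changed since the cutoff."""
    return f'[out:json][timeout:180];way[highway](newer:"{since}");out meta;'


def utc_now():
    """Current time in the ISO 8601 form Overpass expects."""
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
```

A run would then call load_last_run(), send build_query(...) to an Overpass instance of your choice, and finish with save_last_run(utc_now()).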

OpenStreetMap NextGen Development Diary #2

Thank you for the update. As before, I did some local testing here on my box, to better understand what user interactions look like (screenshots are not that good at that).

I think the Profile picture part looks cleaner and easier to understand, and I can see some benefits in splitting up the current “My Settings” page into smaller pieces. I’ve noticed some small details like restricting the permitted file types to images, which helps to improve user experience at essentially zero development effort.

Of course, there are still some rough edges, such as this interesting Internal Server Error: “PIL.Image.DecompressionBombError: Image size (204615540 pixels) exceeds limit of 178956970 pixels, could be decompression bomb DOS attack.”, but that’s ok at this point in time.

I also managed to create some notes via the API 0.6 endpoint, but not on the map itself. Here’s what I could capture in the browser console:

“POST /api/0.6/notes HTTP/1.1” 422 Unprocessable Entity –> input is null

It looks like the payload is lacking some relevant details here. I’m pretty sure I’m missing some pipeline step here. Any sort of documentation would be much appreciated.

-----------------------------11619737919638056413629656613
Content-Disposition: form-data; name="lon"


-----------------------------11619737919638056413629656613
Content-Disposition: form-data; name="lat"


-----------------------------11619737919638056413629656613--
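
For comparison, a well-formed request to the API 0.6 notes endpoint carries non-empty lat, lon and text parameters. The Python sketch below builds such a form-encoded body and rejects empty coordinates up front. The helper name and the sample values are made up for illustration, and it says nothing about how the NextGen frontend actually assembles its payload.

```python
from urllib.parse import urlencode


def build_note_form(lat, lon, text):
    """Build a form-encoded body for POST /api/0.6/notes."""
    if lat is None or lon is None:
        raise ValueError("lat and lon must not be empty")
    if not -90 <= lat <= 90 or not -180 <= lon <= 180:
        raise ValueError("coordinates out of range")
    # Seven decimal places match the precision OSM stores for coordinates.
    return urlencode({"lat": f"{lat:.7f}", "lon": f"{lon:.7f}", "text": text})


print(build_note_form(51.5, -0.1, "Test note"))
```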

Also, the single-click Debug Login didn’t work here; it just doesn’t do anything, not even a message on the browser console. Manually logging in with “user1” and password “openstreetmap” did the trick, though.

I’ve also noticed Cython-build being a bit unstable at the moment, but didn’t spend time to investigate it. I will try this again in a few weeks maybe.

By the way, setting up users on Rails and registering apps could be fully scripted. Registering a new OAuth2 application is essentially a one-liner:

Doorkeeper::Application.create(name: "Local iD", owner_type: "User", owner_id: User.find_by(:display_name => "mmd2").id, redirect_uri: "http://localhost:3000", scopes: ["read_prefs write_prefs write_api read_gpx write_gpx write_notes"])

Results could then be used to automatically update settings.local.yml. My guess is that all preparation steps, including user creation and app registration, could be done in about 20 LOC without any hidden magic involved. Nobody has bothered to do it so far, but it’s fairly easy to do.

OpenStreetMap NextGen Benchmark 1 of 4: Static and unauthenticated requests

the benchmark is mostly measuring docker overhead, not the ruby code in production.

I cannot confirm this. When running the Rails server in development mode, you will notice a similar increased runtime even without Docker in place. This is expected behavior, as mentioned before. Rails development mode is not suitable for performance testing.

OpenStreetMap NextGen Benchmark 1 of 4: Static and unauthenticated requests

Isn’t production also using docker images?

The openstreetmap-website project has no dependency on Docker, and Docker is not used in production either. Someone contributed a Dockerfile a while ago with the idea to facilitate local development. It is completely optional, meaning you can easily set up your local development (or production) environment without Docker. By the way, I don’t use Docker for my local Rails setup either, like a good part of the other contributors.

The Rails app on osm.org production is managed through the Chef repository, which you can find here: https://github.com/openstreetmap/chef/tree/master/cookbooks/web

It includes all the steps to set up a production environment. I still find it easy enough to read and go through. The number of configuration settings may seem a bit daunting at first, so plan some time and don’t be afraid to ask questions if something is not clear to you. I do this all the time.

Regarding performance testing, I would assume that RAILS_ENV=production would be a good starting point for measurement with puma (the default webserver). I also compared puma with Phusion Passenger and found runtime differences to be negligible. So let’s try to keep it simple and check first that your local Rails server is using the right settings.

OpenStreetMap Website Vulnerability Report

Thanks a lot for looking into this and disclosing the issues in a responsible manner. A few comments on some of the remaining open ones:

  • Application Preference Leakage: we’ve indeed been discussing introducing partitioning for application preferences back in https://github.com/openstreetmap/openstreetmap-website/issues/2326 . Unfortunately, the partitioning idea hasn’t been implemented as of now. The issues regarding information leakage, or even manipulation, were already mentioned back then.
  • Notes Search Query Denial of Service: that’s also a well known issue. We even have blacklisted some specific search strings: https://github.com/openstreetmap/chef/commit/e1bc94ff7a1970c8bc669a034ffbf7d0165e510a - it would be good to address the underlying performance issue.
  • Plain-Text Authentication Token Storage: we’ve also been discussing this topic back in October last year, in particular to restrict access to external apps. IIRC, Doorkeeper had some issues when using both hashing and token reuse, that’s probably why this issue is still open.

By the way, I’m a bit surprised not to see any JavaScript or CSRF issues (we’ve been fixing some of these in the past). Is this still to come?

OpenStreetMap NextGen Benchmark 1 of 4: Static and unauthenticated requests

First of all, I find it difficult to make some sense of these figures, since measurements were done on different boxes with fairly different hardware specs. For “Ruby official” and “Ruby test” one can refer to https://hardware.openstreetmap.org/ : spike-0[6-8] are the production osm.org frontend servers, faffy is the development/test server, all of which are running openstreetmap-website as a Rails application. “Python” measurements were presumably performed on a private box with unknown hardware specs.

I initially planned to run the benchmark solely on my local machine, following the official Docker instructions.

Dockerfiles are mainly aimed at local development, and are not suitable for performance measurements. They’re starting up a Rails application in development mode, rather than production mode. It is expected that runtimes in development mode are much higher. I would assume that “Ruby (local)” runtimes were collected in this mode.

As an outlook: I expect to see some nice speedups for these types of micro-benchmarks when using Ruby 3.3.0 with YJIT enabled (currently not yet enabled on osm.org servers). To get an idea about the relative speed difference, I did a very quick test run on my laptop:

Without YJIT:

python benchmark.py 
Benchmarking http://127.0.0.1:3000/copyright...
Min: 0.00737s
Median: 0.00767s

With YJIT enabled:

python benchmark.py 
Benchmarking http://127.0.0.1:3000/copyright...
Min: 0.00431s
Median: 0.00468s

Local server started as: RUBY_YJIT_ENABLE="1" RAILS_ENV=production bundle exec rails s

=> Booting Puma
=> Rails 7.1.3.2 application starting in production 
=> Run `bin/rails server --help` for more startup options
Puma starting in single mode...
* Puma version: 5.6.8 (ruby 3.3.0-p0) ("Birdie's Version")
*  Min threads: 5
*  Max threads: 5
*  Environment: production
*          PID: 52146
* Listening on http://0.0.0.0:3000
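
For readers who want to repeat this kind of comparison, a min/median measurement might look roughly like the Python sketch below. This is not the benchmark.py used above; the function names and the default of 100 runs are assumptions chosen for illustration.

```python
import statistics
import time
from urllib.request import urlopen


def measure(fetch, runs=100):
    """Call fetch() repeatedly and return (min, median) wall-clock seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        fetch()
        durations.append(time.perf_counter() - start)
    return min(durations), statistics.median(durations)


def fetch_copyright(url="http://127.0.0.1:3000/copyright"):
    """Fetch the page once and read the full response body."""
    with urlopen(url) as response:
        response.read()
```

With a local server running in production mode, measure(fetch_copyright) returns the two figures to compare between a run with and without YJIT.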

OpenStreetMap NextGen Takes Shape! (screenshots)

can it be run / tested in parallel with current production code (this is huge issue, as Ops want the code to have proven reliability before even considering to look at it. It is also a reason why keeping with same PostgreSQL schema allowed this project to proceed - if NorthCrab were to insist on original noSQL ideas, that would IMHO be insta-killer for the project adoption)

I’d say at this point in time, it’s basically not possible to run the Python code against the same production APIDB instance, like we do for CGImap, or osmdbt for minutely diffs, or the weekly planet dump.

To be fair, as far as I can tell, the idea was always to migrate the current db to the new schema, and run it on a separate database instance. I don’t recall any post or announcement that claimed to be compatible with the existing schema.

So what has changed?

The 16 database tables currently used for nodes, ways and relations have been replaced by a single “element” table in which each single object version is stored. So one table for everything.

As a result, the code is written in such a way that some concepts are now more generic (e.g. tags behave in the same way regardless of their object type). It remains to be seen how this will affect the performance of our 11 TB database.

Way nodes have been replaced by the member concept already known from relations. That means a way looks a bit like a relation where every member is simply a node. I recall seeing such ideas on the API 0.7 proposal page, where someone suggested that “everything is a relation”. It’s probably the first time this idea has made it into any sort of real implementation.

Tags are stored in JSONB fields, and for node coordinates a new dependency on Postgis has been introduced. As far as I can tell, this is used for nodes only, to enable polygon based node queries.
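
To make the “one table for everything” idea a bit more tangible, here is a toy Python illustration. It is not the actual openstreetmap-ng model from the reference linked in this comment; all class and field names are hypothetical and only mirror the concepts described above (one generic element, tags in a single field, ways expressed through members).

```python
from dataclasses import dataclass, field
from typing import Optional, Tuple


@dataclass(frozen=True)
class Member:
    type: str        # "node", "way" or "relation"
    id: int
    role: str = ""   # stays empty for the nodes of a way


@dataclass
class Element:
    type: str
    id: int
    version: int
    tags: dict = field(default_factory=dict)      # stands in for a JSONB field
    members: list = field(default_factory=list)   # empty for nodes
    point: Optional[Tuple[float, float]] = None   # (lon, lat), nodes only


def way_from_nodes(way_id, version, node_ids, tags=None):
    """Express a way as an element whose members are all nodes."""
    members = [Member("node", node_id) for node_id in node_ids]
    return Element("way", way_id, version, tags or {}, members)
```

In this toy model a closed way is simply an element whose first and last member refer to the same node.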

There are many more changes in place which I will skip for now, since I cannot comment on them without some more in depth testing.

Reference: https://github.com/Zaczero/openstreetmap-ng/blob/main/app/models/db/element.py#L23-L34

OpenStreetMap NextGen Takes Shape! (screenshots)

I tried to run the code locally, and took a few screenshots like this one:

image

It’s still a bit bumpy to get this up and running w/o any sort of documentation, but hey, that was kind of expected at this point in time. Nix Shell was a huge pain on my system, with issues due to incompatible GLIBC versions and random /nix libraries being injected into LD_LIBRARY_PATH that made some programs fail, while others weren’t working without it.

Obviously, I was more interested to see a bit of API stuff in action. For some reason http://localhost:3000/api/0.6/node/1 would return this nice error message. This looks like a topic for another time.

{"detail":[{"type":"is_instance_of","loc":["path","type"],"msg":"Input should be an instance of StrEnum","input":"node","ctx":{"class":"StrEnum"},"url":"https://errors.pydantic.dev/2.6/v/is_instance_of"}]}

Small Towns in Europe

That took about 3 days 😅

I believe one reason for this long runtime could be your pre-filtering on certain areas (not exactly sure if this is how you’ve done it). When using https://overpass-turbo.eu/s/1GQm to analyze all place nodes on a global scale, the runtime should be less than 1.5 hours. It returns about 401k place nodes that would need some additional filtering by location as a post-processing step.

If you like to try this out, I have uploaded the query result here: https://dev.overpass-api.de/misc/test_buildings.xml.gz (file size: 21M)
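
The post-processing step could be as simple as a bounding box check over the OSM XML result. The Python sketch below assumes that a plain bounding box is good enough as a location filter; the function name and the box coordinates used in the note underneath are illustrative only.

```python
import xml.etree.ElementTree as ET


def place_nodes_in_bbox(xml_source, south, west, north, east):
    """Yield (id, lat, lon, tags) for place nodes inside the bounding box."""
    for _, elem in ET.iterparse(xml_source, events=("end",)):
        if elem.tag != "node":
            continue
        lat, lon = float(elem.get("lat")), float(elem.get("lon"))
        tags = {t.get("k"): t.get("v") for t in elem.findall("tag")}
        if "place" in tags and south <= lat <= north and west <= lon <= east:
            yield elem.get("id"), lat, lon, tags
        elem.clear()  # free the processed node to keep memory usage low
```

For the compressed download, pass gzip.open(path, "rb") as xml_source, for example with a rough box such as south=35, west=-10, north=70, east=40.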

A minute of facts about the duration of changesets

Five years ago, we already discussed adding an optional “close_changeset=true” attribute to the osmChange header. This would, as the name says, close the changeset as part of the upload, without the need to send an additional changeset close message. Unlike the proposed API 0.7 changes, it wouldn’t introduce an incompatible change, since it’s an optional attribute only.

Link: https://github.com/openstreetmap/openstreetmap-website/issues/2201
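
As an illustration of how small the change would be on the client side, the Python sketch below builds an osmChange document with the optional attribute set. The attribute is only the proposal from the linked issue, so this should not be read as something current servers act on; the function name and the sample node are made up.

```python
import xml.etree.ElementTree as ET


def build_osmchange(changeset_id, node_id, lat, lon, close_changeset=False):
    """Build a minimal osmChange document that creates a single node."""
    root = ET.Element("osmChange", version="0.6", generator="sketch")
    if close_changeset:
        # Proposed optional attribute; it is simply omitted otherwise.
        root.set("close_changeset", "true")
    create = ET.SubElement(root, "create")
    ET.SubElement(
        create,
        "node",
        id=str(node_id),
        changeset=str(changeset_id),
        lat=str(lat),
        lon=str(lon),
    )
    return ET.tostring(root, encoding="unicode")


print(build_osmchange(123, -1, 51.5, -0.1, close_changeset=True))
```

A client that doesn’t set the flag produces exactly the same document as today, which is why the attribute would stay backwards compatible.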

OAuthtung!

Yes, that’s just a normal OAuth2 Bearer Token, which doesn’t expire, like all other OAuth 2 tokens at the moment. The actual generation happens here: https://github.com/openstreetmap/openstreetmap-website/blob/master/app/models/user.rb#L379-L387

Doorkeeper…find_or_create_for is the relevant bit here to trigger the generation on the backend for a given application/user/list of scopes (assuming the token hasn’t been created yet, otherwise the existing token is retrieved).

OAuthtung!

FWIW: The “OpenStreetMap Web Site” OAuth2 application is also officially documented here: https://github.com/openstreetmap/openstreetmap-website/blob/master/CONFIGURE.md#oauth-consumer-keys -> To allow Notes and changeset discussions to work, follow a similar process, this time registering an OAuth 2 application for the web site […] Check boxes for the following Permissions ‘Modify the map’ and ‘Modify notes’.

OAuthtung!

It’s kind of funny that you went the extra mile and blurred the OAuth token on the webpage, then pasted it in plain text and clearly visible in your terminal window. I hope you’ve revoked that token in the meantime ;)

OpenStreetMap Service Availability (2023-12-20 - 2024-01-20)

Yearly database re-indexing was running on the weekend of 01-14, with periods of fairly high load on the database server: https://prometheus.openstreetmap.org/d/Ea3IUVtMz/host-overview?orgId=1&var-instance=snap-01&from=1705130975266&to=1705276768520

This might have impacted some queries to take longer than usual, or even time out.

By the way, the CGImap link points to an outdated mirror. It should be https://github.com/zerebubuth/openstreetmap-cgimap instead.

Future deprecation of HTTP Basic Auth and OAuth 1.0a

So yes, you can do all this in a few lines of shell script, by using only curl and jq, without any external libs, local HTTP server, or anything: https://gist.github.com/mmd-osm/b61956bb4b92e9b37488189379b380c9

Before trying this out, be sure to sign up on the dev instance https://master.apis.dev.openstreetmap.org (you already knew this).

Bonus points for storing the access token in a local file, so you don’t need to go through the osm.org authorization each time you’re running the script. I was too lazy to implement that.

If you’re as lazy as I am, you can simply reuse the access token and treat it as some kind of Personal Access Token. Line 14 shows you how to use the access token to call an API endpoint.

Disclaimer: This is only meant for personal scripts and local testing. Also, please register your own app and replace client id and client secret with your own values. Use urn:ietf:wg:oauth:2.0:oob as redirect URL.
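
The “bonus points” part, caching the access token in a local file, could look roughly like the Python sketch below. It is not a port of the gist: the helper names and the JSON file layout are assumptions, and the code exchange against the token endpoint is left out. Only the authorization URL, the token cache and the Bearer header are shown.

```python
import json
import os
from pathlib import Path
from urllib.parse import urlencode

BASE = "https://master.apis.dev.openstreetmap.org"
REDIRECT_URI = "urn:ietf:wg:oauth:2.0:oob"


def authorize_url(client_id, scopes):
    """URL to open in a browser to obtain an authorization code."""
    query = urlencode({
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": REDIRECT_URI,
        "scope": " ".join(scopes),
    })
    return f"{BASE}/oauth2/authorize?{query}"


def save_token(token, path):
    """Store the access token locally, readable by the current user only."""
    path = Path(path)
    path.write_text(json.dumps({"access_token": token}))
    os.chmod(path, 0o600)


def load_token(path):
    """Return the cached access token, or None if nothing is cached yet."""
    path = Path(path)
    if not path.exists():
        return None
    return json.loads(path.read_text())["access_token"]


def auth_header(token):
    """HTTP header to send along with authenticated API calls."""
    return {"Authorization": f"Bearer {token}"}
```

A script would call load_token() first and only walk through the browser authorization when it returns None.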

OpenStreetMap Service Availability (2023-11-20 - 2023-12-20)

12-03 issue was handled in this operations ticket: https://github.com/openstreetmap/operations/issues/1008

IIRC a well known company did some fairly extensive web page and API scraping over the weekend and was blocked subsequently. (Mentioned in another operations issue)

Leveraging PostGIS to Write Your FlatGeobuf Files

Spaces in code blocks work just fine, see https://kramdown.gettalong.org/syntax.html#code-blocks

Example:

def what?
  42
end

Multiple user accounts in JOSM

Switching accounts inside JOSM isn’t exactly a new idea: https://josm.openstreetmap.de/ticket/2710

Multiple user accounts in JOSM

Why don’t you use josm.home? Seems much easier to me to manage multiple profiles/users… https://community.openstreetmap.org/t/mehrere-josm-profile/74386

Future deprecation of HTTP Basic Auth and OAuth 1.0a

My target timeline for the C++ part of the API is no later than Q1/2024, see https://github.com/zerebubuth/openstreetmap-cgimap/issues/286

I don’t know what OWG will eventually come up with. Also sysadmins have the final say on what to deploy at which point in time.