Disclaimer: I am not an OSM website developer. All information here was obtained by looking at the OSM GitHub repository and poking at the OSM website.
There’s been some controversy recently over the contents of the OpenStreetMap robots.txt file. I think it might be informative to look at what the file actually does.
Allow: /user/
This does nothing. “Allow” lines in a robots.txt file permit the crawling of URLs that would otherwise be denied, but there’s nothing in the file that would deny the /user hierarchy.
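As a rough check of this claim, here is a minimal sketch using Python's standard `urllib.robotparser`. It compares a small rule set with and without the Allow line. The user agent `ExampleBot`, the user name `example`, and the two-rule subset are assumptions made for illustration. The standard-library parser does not understand `*` wildcards, so only plain prefix rules from the file are used.

```python
from urllib.robotparser import RobotFileParser


def build(lines):
    """Parse robots.txt lines into a RobotFileParser."""
    parser = RobotFileParser()
    parser.parse(lines)
    return parser


# Illustrative subset of the file: plain prefix rules only.
without_allow = build(["User-agent: *", "Disallow: /edit", "Disallow: /login"])
with_allow = build(
    ["User-agent: *", "Allow: /user/", "Disallow: /edit", "Disallow: /login"]
)

# "example" is a hypothetical user name.
paths = ["/user/example", "/edit", "/login", "/way/238241022"]
results = {
    path: (
        with_allow.can_fetch("ExampleBot", path),
        without_allow.can_fetch("ExampleBot", path),
    )
    for path in paths
}

for path, (a, b) in results.items():
    print(f"{path}: with Allow={a}, without Allow={b}")
```

For every path, the two parsers should agree, which is what "does nothing" means in this subset.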
Disallow: /traces/tag/
Disallow: /traces/page/
These are alternate ways of searching the GPS traces that have been uploaded to the site. The main trace listing is still accessible.
Disallow: /trace/
This is the API endpoint for accessing GPS traces. It is not intended to be displayed in a web browser, and contains nothing useful for a search engine.
Disallow: /api/
This is the API endpoint for editing the map. It is not intended to be displayed in a web browser, and contains nothing useful for a search engine.
Disallow: /edit
This is the URL for the in-browser editor. Everything under this URL is behind a login barrier, and it contains nothing useful for a search engine.
Disallow: /message
This is the URL hierarchy for the on-site PM system. Everything under this URL is behind a login barrier, and it contains nothing useful for a search engine.
Disallow: /login
This is the above-mentioned login barrier. It contains nothing useful for a search engine.
Disallow: /history
This is the visual history browser. The contents change far too rapidly to meaningfully index on a search engine.
Disallow: /geocoder
This is the on-site search system. Search engines searching search engines never ends well.
Disallow: /browse
Disallow: /*lat=
Disallow: /*node=
Disallow: /*way=
Disallow: /*relation=
These are obsolete URL hierarchies for browsing individual map elements. The current URL hierarchy, with URLs of the form https://www.openstreetmap.org/way/238241022, can be indexed by search engines.
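To see why the current URLs slip past these rules, here is a minimal sketch of wildcard matching as commonly implemented by crawlers (`*` matches any run of characters, a trailing `$` anchors the end, and everything else is a prefix match). The helper names `rule_matches` and `blocked` are hypothetical, and the old-style example URLs are illustrative guesses at what the obsolete forms looked like, not taken from the site.

```python
import re

# The five rules discussed above.
OBSOLETE_RULES = ["/browse", "/*lat=", "/*node=", "/*way=", "/*relation="]


def rule_matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt path pattern matches a path (plus query)."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape regex metacharacters, then let each '*' match any characters.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start, giving prefix-match semantics.
    return re.match(regex, path) is not None


def blocked(path: str) -> bool:
    """Return True if any of the obsolete-URL rules matches the path."""
    return any(rule_matches(rule, path) for rule in OBSOLETE_RULES)


for path in ["/way/238241022", "/browse/way/238241022", "/?way=238241022"]:
    print(path, "blocked" if blocked(path) else "crawlable")
```

The current form `/way/238241022` contains neither the `/browse` prefix nor a `way=` query parameter, so none of the five patterns applies to it.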
Disallow: /user/*/traces/
Disallow: /user/*/diary
Disallow: /diary
These are the only entries that block pieces of the site that might be of interest to a search engine. /user/*/traces/ are the description pages for individual GPS traces, /user/*/diary is individual diary entries, and /diary is the main diary listing.
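The following sketch shows how these three rules interact with the earlier "Allow: /user/" line, assuming the crawler uses the longest-match precedence described in RFC 9309 (the most specific matching rule wins, and Allow wins a tie). The function names and the user name `example` are hypothetical. Crawlers that use other precedence schemes may behave differently.

```python
import re

# The /user/ and diary rules from the file, as (kind, pattern) pairs.
RULES = [
    ("allow", "/user/"),
    ("disallow", "/user/*/traces/"),
    ("disallow", "/user/*/diary"),
    ("disallow", "/diary"),
]


def matches(pattern: str, path: str) -> bool:
    """Prefix match where '*' stands for any run of characters."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.match(regex, path) is not None


def is_allowed(path: str) -> bool:
    """Apply longest-match precedence; paths with no matching rule are allowed."""
    hits = [
        (len(pattern), kind == "allow")
        for kind, pattern in RULES
        if matches(pattern, path)
    ]
    if not hits:
        return True
    # Tuples compare by length first; on a tie, True (allow) beats False.
    return max(hits)[1]


for path in ["/user/example", "/user/example/diary", "/user/example/traces/1", "/diary"]:
    print(path, "crawlable" if is_allowed(path) else "blocked")
```

Under this assumption, a user's profile page stays crawlable while their diary entries and trace pages do not, because the longer Disallow patterns outrank the shorter "Allow: /user/".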