Share how addresses are abbreviated in your language #786
In this issue, please share how address components (state/province, city, street, house number) are abbreviated in your language/country. This will help us expand our synonyms list for the search.
For example:
In English:
st. -> street, saint
ave. -> avenue
blvd. -> boulevard
Or you can edit the list and submit a PR.
Current list:
https://codeberg.org/comaps/comaps/src/branch/main/libs/search/query_params.cpp#L18
https://wiki.openstreetmap.org/wiki/Name_finder/Abbreviations
For English:
Line 31: change {"dr", {"doctor"}}, to {"dr", {"drive", "doctor"}},
Not sure what that note on line 28 means.
Here's a great resource for abbreviations in English (shows proper abbreviations and misspellings too): https://pe.usps.com/text/pub28/28apc_002.htm
Arabic
The linked PR has been merged.
@jeanbaptisteC wrote in #786 (comment):
Yes, but this issue was more of a feedback one, we can still collect synonyms and implement in another PR.
(maybe re-open and pin it? WDYT?)
Copying from: #977 (comment)
Some I found missing are:
"b" ... "bei"
"st" ... "sancta", "sankt"
@vikiawv wrote in #786 (comment):
#1824
More abbreviations in Arabic
Doctor
University
@omarhassan wrote in #786 (comment):
#1835
German:
Str. → Straße (street)
St. → Sankt
@foss- wrote in #786 (comment):
Already done:
{"str", {"strasse", "stary", "stara", "strada", "straat", "stare", "straße"}},
{"santo", "sant", "sint", "saint", "stara", "street", "stary", "stora", "sankt", "store", "stare", "stig",
French:
av --> avenue
bd --> boulevard
ch --> chemin
imp --> impasse
res --> résidence
ZI --> zone industrielle
ZA --> zone d'activité
@G_LL_M wrote in #786 (comment):
#1868
Turkish
Neighborhood
Street
Avenue
@metehan wrote in #786 (comment):
#1914
SPANISH
Street: Calle → C/ or C.
Avenue: Avenida → Avda. or Avd.
Ringroad: Ronda → Rda.
Boulevard: Bulevar → Blvr.
Alley: Pasaje → Pje. (also Calleja or Callejón → Call.)
Path: Paseo → P. (also Camino → Cam.)
Road: Carretera → Ctra.
Roundabout: Glorieta → Gta. (also Rotonda → Rot.)
Slope: Cuesta → Cta.
Bridge: Puente → Pte.
Square: Plaza → Pza. or Pl.
I got all these from the list of abbreviations of the Royal Spanish Academy (RAE): https://www.rae.es/dpd/ayuda/abreviaturas. Note that these are the words and abbreviations used in Spain. Many of them are likely the same in other Spanish-speaking countries but I don't know.
The link provided by @matheusgomesms in #786 (comment): https://wiki.openstreetmap.org/wiki/Name_finder/Abbreviations is probably much more complete in terms of Spanish worldwide.
@Pol853 wrote in #786 (comment):
#1951
@x7z4w Thanks for the great work. I noticed your PRs always mention synonyms, not abbreviations.
Is this issue just for abbreviations, or for synonyms as well? I have some synonyms for Arabic, but I have a few questions first.
If I can add synonyms, how are they added (in libs/search/query_params.cpp)? For example, if A = B = C, is {"A", {"B", "C"}} enough, or do we need to add all combinations, {"B", {"A", "C"}} and {"C", {"A", "B"}}, so that a search for A, B or C returns all results? This only matters for synonyms, since an abbreviation always maps from the short form to the longer words.
Another question: does this happen before or after search normalization (in base/normalize_unicode.cpp)? For example [not a real abbreviation], given {"âốế", {"Any Other Entry"}}, if the user searches for "aoe", does the expansion still happen so that results containing "Any Other Entry" appear?
Or should the entry be {"aoe", {"Any Other Entry"}} [assuming the user query is normalized first and abbreviation expansion is done afterwards]?
@omarhassan wrote in #786 (comment):
Internally they're called synonyms, but these are abbreviations (st -> street, saint), as AFAIK search will still try to match the original query.
If you need synonyms for POIs (cafe, coffee -> amenity=cafe), please see.
@omarhassan wrote in #786 (comment):
Yes, if you need:
1st -> first and first -> 1st,
then you need to add all combinations:
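A minimal sketch of such an expansion, assuming a group of interchangeable terms should map every member to all the others. SynonymTable and AddSymmetricGroup are hypothetical names used for illustration only; the actual table in libs/search/query_params.cpp may be structured differently.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical table type: key -> list of expansions.
using SynonymTable = std::map<std::string, std::vector<std::string>>;

// Registers every member of `group` as a key that expands to all other members,
// so {"1st", "first"} yields both 1st -> first and first -> 1st.
void AddSymmetricGroup(SynonymTable & table, std::vector<std::string> const & group)
{
  for (std::size_t i = 0; i < group.size(); ++i)
  {
    for (std::size_t j = 0; j < group.size(); ++j)
    {
      if (i != j)
        table[group[i]].push_back(group[j]);
    }
  }
}
```

For a group of three terms A, B and C this produces the three rows asked about above: A -> B, C; B -> A, C; C -> A, B.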
@omarhassan wrote in #786 (comment):
Not sure, probably after normalization, as it also tokenizes the query before matching synonyms.
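To illustrate that order, here is a minimal sketch that lowercases and tokenizes a query first and only then looks up each token. It rests on the assumption (not confirmed in this thread) that expansion runs on normalized tokens. Tokenize, Expand and SynonymTable are hypothetical names, and the code handles ASCII only; folding something like "âốế" to "aoe" would need real Unicode normalization, which is out of scope here.

```cpp
#include <cctype>
#include <map>
#include <string>
#include <vector>

// Hypothetical table type: key -> list of expansions.
using SynonymTable = std::map<std::string, std::vector<std::string>>;

// Lowercases ASCII letters and splits on anything that is not an ASCII letter
// or digit, so "St. Mary" becomes {"st", "mary"}.
std::vector<std::string> Tokenize(std::string const & query)
{
  std::vector<std::string> tokens;
  std::string current;
  for (unsigned char c : query)
  {
    if (std::isalnum(c))
    {
      current.push_back(static_cast<char>(std::tolower(c)));
    }
    else if (!current.empty())
    {
      tokens.push_back(current);
      current.clear();
    }
  }
  if (!current.empty())
    tokens.push_back(current);
  return tokens;
}

// For each token, returns the token itself followed by its expansions (if any).
// The original token is kept so that the unexpanded query can still match.
std::vector<std::vector<std::string>> Expand(SynonymTable const & table, std::string const & query)
{
  std::vector<std::vector<std::string>> result;
  for (auto const & token : Tokenize(query))
  {
    std::vector<std::string> variants{token};
    auto const it = table.find(token);
    if (it != table.end())
      variants.insert(variants.end(), it->second.begin(), it->second.end());
    result.push_back(variants);
  }
  return result;
}
```

With the entry st -> street, saint, the query "St. Mary" yields the variants st, street, saint for the first token and mary for the second. Under this assumption, table keys would have to be stored in their normalized form.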
@Pol853 wrote in #786 (comment):
I'd add:
Av. - Avenida
Esq. - Esquina
C.C. - Centro Comercial
Urb. - Urbanización
Res. - Residencias / Residencia
Edif. - Edificio
Qta. - Quinta
Pto. - Puerto
@patepelo wrote in #786 (comment):
#2033
Lithuanian
street - gatvė
st. - g.
square - skveras, aikštė
sqr. - skv., a.
avenue - prospektas
ave. - pr.
stop - stotelė
stop - st.
Czech #2073
@x7z4w wrote in #786 (comment):
The file contains duplicates - either the row is completely duplicated or the abbreviation is duplicated:
{"gr", {"grande rue", "grandes rues", "gracht", "grand’rue", "gränd", "graben", "grovet", "gränden", "grove"}},
{"gr", {"großes", "große", "großer"}},
{"ht", {"heights"}},
{"ht", {"hinteres", "hinterer", "hinter…", "hintere"}},
{"ir", {"ingenieur"}},
{"ir", {"insinyur", "ingenieur"}},
{"kap", {"kapelle"}},
{"kap", {"kapitan"}},
{"mgr", {"monseigneur"}},
{"mgr", {"monseigneur"}},
{"mjr", {"majora"}},
{"mjr", {"majora", "major"}},
{"mr", {"meester"}},
{"mr", {"meester", "meander"}},
{"pln", {"plaine", "plein"}},
{"pln", {"plein"}},
{"ppor", {"podporučíka"}},
{"ppor", {"pporucznika", "podporucznika", "podporucznik"}},
{"q", {"quadra", "quận"}},
{"q", {"quelle"}},
{"vd", {"van de", "van den", "van der"}},
{"vd", {"vorderer", "vorderes", "vordere"}},
{"大", {"国立大学法人", "公立大学法人"}},
{"大", {"大学"}},
{"o.l.v", {"onze-lieve-vrouw"}},
{"o.l.v", {"onze-lieve-vrouw"}},
@deivpaukst wrote in #786 (comment):
#2081
@ikanakova wrote in #786 (comment):
I created PR #2095 with the removal of duplicates.
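For reference, one way such duplicates could be merged is sketched below. It assumes the entries are available as (abbreviation, expansions) pairs; Entry and MergeDuplicates are hypothetical names, and this is not necessarily how PR #2095 resolved them.

```cpp
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical row type: abbreviation plus its expansions.
using Entry = std::pair<std::string, std::vector<std::string>>;

// Merges rows that share an abbreviation into one row and drops repeated
// expansions. std::set keeps each expansion once, in sorted order.
std::map<std::string, std::set<std::string>> MergeDuplicates(std::vector<Entry> const & entries)
{
  std::map<std::string, std::set<std::string>> merged;
  for (auto const & entry : entries)
    merged[entry.first].insert(entry.second.begin(), entry.second.end());
  return merged;
}
```

Applied to the two "mjr" rows above, this gives a single row with the expansions major and majora.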
I would suggest leaving this topic open-ended to gather feedback for more languages. WDYT?
It was unpinned and there wasn't any activity recently.
You can re-open/pin it if you want.
RE: pin, there is a hard limit of 3. I don't know who unpinned it; there were also other good candidates to pin. I'm gonna reopen it but not pin it, so it can be easily found in search.
@matheusgomesms wrote in #786 (comment):
I recently added a bunch of Danish abbreviations to this page, so please import from there.
Some Romanian abbreviations:
Bl. - Bloc (Apartment block)
Sc. - Scara (Staircase entrance)
!!! Here blocks are tagged either as "Bl. X" or "Bloc X" depending on the mapper and the time they were added, so if someone searches for "Bloc Y" the search should be able to return "Bl. Y" too. Same thing with Scara.
G-Ral - General (General)
Lt. - Locotenent (Lieutenant)
Col. - Colonel (Colonel)
Dr. - Doctor (Doctor)
P-ța - Piața (Square/Market)
Pța - Piața
Str. - Strada
Bd. - Bulevardul (Boulevard)
Bd-ul - Bulevardul
Bdul - Bulevardul
Șos. - Șoseaua (Road)
Sf. - Sfântul (Saint)
Sf. - Sfânta (Saint)
Sf. - Sfinții (Saints)
Note: Slightly related and since I couldn't find an issue for this.
How do we handle common translations in, for example, bilingual places?
E.g. in Catalan "Carrer" means "Street", which in Spanish translates to "Calle". Could we represent this translation so that streets can be found with either word?
@patepelo wrote in #786 (comment):
We don't have a "street" type, only highway classifications from OSM (which aren't necessarily representing e.g. "avenue" or "highways").
Ukrainian:
state/province -> обл.
city -> м.
district -> р-н. (район)
street -> вул. (вулиця)
avenue / prospect -> просп., пр-т. (проспект)
lane / alley -> пров. (провулок)
boulevard -> бул. (бульвар)
square -> пл. (площа)
descent -> узв. (узвіз)
dead-end -> туп. (тупик)
house -> буд. (будинок) ex: "буд. №9"
@teletext wrote in #786 (comment):
@st_ua wrote in #786 (comment):
#2912
Thanks @matheusgomesms, that sounds like a great resource! Croatia was missing completely there, so I've now invested some time to add a list of Croatian abbreviations at the wiki: https://wiki.openstreetmap.org/wiki/Name_finder/Abbreviations#Hrvatski_-_Croatian
I'd wait a few days so other Croatian mappers can update it (there was some interest in our local chat), and then it would be nice to import it in CoMaps.
Has someone systematically looked at what is available in that wiki and converted it into a CoMaps PR?
@x7z4w wrote in #786 (comment):
I want to suggest that you keep track of the chosen language of the map.
What I mean: with, let's say, English chosen, "la tour eiffel" is named Eiffel Tower in search. With "keep names" chosen, you need to search for tour eiffel in all languages.
By the way, to contribute more abbreviations (for German):
1te -> erste -> 1.
2te -> zweite -> 2.
3te -> dritte
[any Number]te -> (N)te
Prefixes:
Groß*. (Große, Großer, Großes) -> gr./gr [Große Hauptstraße -> Gr. Hauptstr., Große Straße -> gr str]
Klein*. (Kleine, Kleiner, Kleines) -> kl./kl [Kleiner Gartenweg -> Kl. Gartenweg, Kleine Straße -> kl str]
Suffixes:
Land/(x)land (as suffix) -> (x)l
Deutschland -> Deutschl
Sauerland -> Sauerl
Straße (street) -> Str. or Str (without dot)
Especially in German, we like to concatenate things.
Die Stardardnamenswegvonirgendeinemheini(straße) could be written as Stardardnamenswegvonirgendeinemheinistr(.),
Linux-Torvald-Str., Konrad-Zuse-Str., Lindenstr. (with small s) etc.
Special abbreviations for cities:
Frankfurt an der Oder (a river) -> F a.d.O (no, this is wrong), is found by Frankfurt (Oder), but one would search for Frankfurt Oder
Frankfurt am Main (the city of banks on a "male" river) -> F a.M./Frankfurt (Main), expected to be searched as Frankfurt Main [FrankfurtMain]
(?i)\bfrankfurt(?:\s+am|\s+)?\s(?:main|
?main ?)\b
Brandenburg an der Havel -> Brandenburg (Havel)
Instead of a river name, the addition can also be a region.
Menden (Sauerland) as shown on OSM, but really means Menden im Sauerland where Sauerland clarifies that the city with a redundant name lies in this region.
We would search for Menden Sauerland.
Here I must note that even the OSM data are not consistent, and I could not find a rule for when a city has a suffix "an der Ruhr" ("female" river) or just a bracket + name "(Ruhr)".
And then there are common names overlaying official names, like the "Blaues Wunder" (blue wonder) for the Loschwitzer Brücke in Dresden.
So my suggestion is: differentiate between search result (fully qualified name) and search text (WYSIWYG).
@developsman I think you've pasted the reasoning_content from an LLM, this is some gibberish. Please clarify.
But I'll try to answer at least some of these:
name:en etc. is already handled and unrelated to this issue.
What do you mean?
This is what we have been doing for 20 years, and it was used in many servers, especially LIMS and CRM related products.
The assumption without evidence does not appear very professional.
Why should Frankfurt an der Oder be shortened to F.o.d.O.? This is a "Verschlimmbesserung", an improvement that is in reality worse than before.
If you want to make a short name, it should be recognisable, like: Frankfurt a. d. Oder
I'll try to explain what they mean with the city names in a more cohesive way. The rest are basically just simple abbreviations.
First let me explain the background on regions. Besides the official levels of organisation like states, judicial districts, counties etc., there are a lot of (unofficial) regions in Germany. Partly they are separate, partly they overlap with each other, partly they can even be inside a region as a subregion. Despite them not being official, people generally know what they are.
Now coming to the actual issue. There can be multiple cities with the same name in Germany. Often to keep them apart the names are used in conjunction with a region or a river.
First let's start with regions. The name for the city of Menden in the region of Sauerland can be just Menden, Menden im Sauerland (Menden in the Sauerland region) or Menden (Sauerland). The official name might be just one of those variations, but people still might use and search for any of those variations or even a simple Menden Sauerland.
Sometimes the region can even be abbreviated like Hamm (Westf.) or Hamm (Westf) for the city Hamm in the region of Westfalen.
One other thing that comes into play here is that German words and even more so names can have a grammatical gender. So it is not always "im" like it is with Menden im Sauerland. It also can be "in" like Hamm in Westfalen, "in der" like Monschau in der Eifel etc.
With rivers it is similar. There are famously two cities named Frankfurt and both are near a river. One near the river Main, one near the river Oder. There is Frankfurt am Main and Frankfurt an der Oder. They can also be referred to just as Frankfurt (usually the far bigger one at the Main is meant then) or with the river in parentheses like Frankfurt (Main) or Frankfurt (Oder).
Again grammatical gender also plays a role here. The river Main has a male grammatical gender and therefore it is "am" and the river Oder has a female grammatical gender so it is "an der". Complicating this even further there can be even abbreviations for the words in between like Frankfurt a. d. Oder or including them in parenthesis like Frankfurt (an der Oder) or Frankfurt (a. d. Oder).
These are just a limited set of examples, but it is the same for other cities, regions and rivers.
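The variants described above can be reduced to a common key by dropping parentheses, periods and the connecting words. The following is only an illustrative sketch: CanonicalCityKey is a hypothetical helper, not part of CoMaps, and the connector list is an assumption drawn from the examples in this thread, so it is neither exhaustive nor safe for every place name.

```cpp
#include <cctype>
#include <set>
#include <string>

// Hypothetical helper: reduces spellings such as "Menden (Sauerland)" and
// "Menden im Sauerland" to the same key, "menden sauerland".
std::string CanonicalCityKey(std::string const & name)
{
  // Connector words and their abbreviated forms ("a. d." splits into "a", "d").
  static std::set<std::string> const kConnectors = {"im", "in", "am", "an", "der", "a", "d"};

  std::string key;
  std::string token;
  auto flush = [&]()
  {
    // Append the finished token to the key unless it is a connector word.
    if (!token.empty() && kConnectors.count(token) == 0)
    {
      if (!key.empty())
        key.push_back(' ');
      key += token;
    }
    token.clear();
  };

  for (unsigned char c : name)
  {
    // Bytes >= 0x80 belong to UTF-8 sequences (e.g. umlauts) and are kept unchanged.
    if (std::isalnum(c) || c >= 0x80)
      token.push_back(static_cast<char>(c < 0x80 ? std::tolower(c) : c));
    else
      flush();  // Spaces, parentheses and periods end the current token.
  }
  flush();
  return key;
}
```

With this sketch, Frankfurt an der Oder, Frankfurt (Oder), Frankfurt a. d. Oder and Frankfurt (an der Oder) all map to "frankfurt oder". Abbreviated regions such as Hamm (Westf.) would still need a separate mapping from westf to westfalen.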
Lithuanian:
a. → aikštė
al. → alėja
aps. → apskritis
apskr. → apskritis
ež. → ežeras
g. → gatvė
k. → kaimas
m. → miestas
mst. → miestas
mstl. → miestelis
pr. → prospektas
r. → rajonas
raj. → rajonas
sen. → seniūnija
skg. → skersgatvis
skv. → skveras
st. → stotelė
vs. → viensėdis
@rimas I'd add the one I mentioned earlier too.
skv. → skveras
@rimas wrote in #786 (comment):
#3249
Spanish:
ppal. → principal
urb. → urbanización
dist. → distribuidor
@patepelo wrote in #786 (comment):
#3255
@yannikbloscheck wrote in #786 (comment):
The search would already match both Frankfurts, so we don't need to add anything else for this.
I've addressed some of these synonyms in the latest and previous PRs.