
ChatGPT as a mapping assistant: A novel method to enrich maps with generative AI and content derived from street-level photographs

Levente Juhász¹, Peter Mooney², Hartwig H. Hochmair³, and Boyuan Guan¹

¹ GIS Center, Florida International University, Miami, FL 33199, USA ({ljuhasz,bguan}@fiu.edu)
² Department of Computer Science, Maynooth University, Co. Kildare, Ireland ([email protected])
³ Geomatics Sciences, University of Florida, Ft. Lauderdale, FL 33144, USA ([email protected])

arXiv:2306.03204v1 [cs.CY] 5 Jun 2023

Abstract. This paper explores the concept of leveraging generative AI as a mapping assistant for enhancing the efficiency of collaborative mapping. We present results of an experiment that combines multiple sources of volunteered geographic information (VGI) and large language models (LLMs). Three analysts described the content of crowdsourced Mapillary street-level photographs taken along roads in a small test area in Miami, Florida. GPT-3.5-turbo was instructed to suggest the most appropriate tagging for each road in OpenStreetMap (OSM). The study also explores the utilization of BLIP-2, a state-of-the-art multimodal pre-training method, as an artificial analyst of street-level photographs in addition to human analysts. Results demonstrate two ways to effectively increase the accuracy of mapping suggestions without modifying the underlying AI models: by (1) providing a more detailed description of source photographs, and (2) combining prompt engineering with additional context (e.g. location and objects detected along a road). The first approach increases the suggestion accuracy by up to 29%, and the second one by up to 20%.

Keywords: ChatGPT · OpenStreetMap · Mapillary · LLM · volunteered geographic information · mapping

1 Introduction
Generative Artificial Intelligence (AI) is a type of AI that can produce various types of content, including text, imagery, audio, code, and simulations. It has gained enormous attention since the public release of ChatGPT in late 2022. ChatGPT is an example of a Large Language Model (LLM), a form of generative AI that produces human-like language. Since the launch of ChatGPT, researchers, including the geographic information science (GIScience) community, have been trying to understand the potential role of AI for research, teaching, and applications. ChatGPT can be used extensively for Natural Language Processing (NLP) tasks such as text generation, language translation, writing software code, and generating answers to a plethora of questions, engendering both positive and adverse impacts [2]. The emergence of generative AI has introduced transformative opportunities for spatial data science. In this paper we explore the potential of generative AI to assist human cartographers and GIS professionals in increasing the quality of maps, using OSM as a test case (Figure 1).

Paper submitted to The Fourth Spatial Data Science Symposium #SDSS2023.

Fig. 1. OpenStreetMap roads and Mapillary images in the study area near Downtown Miami

GeoAI has been part of the GIScience discourse in recent years. For example, Janowicz et al. [7] elaborated on whether it was possible to develop an artificial GIS analyst that passes a domain-specific Turing test. While this question is still largely unanswered, our study contributes to early steps in this direction by utilizing an LLM (ChatGPT) and a multimodal pre-training method (BLIP-2) to connect visual and language information in the context of mapping. We explore the larger question of whether generative AI is a useful tool in the context of creating and enriching map databases, and more specifically investigate the following research questions:

1. Is generative AI capable of turning natural language text descriptions into the correct attribute tagging of road features in digital maps?
2. For this problem, can the accuracy of suggestions be improved through prompt engineering [16]?
3. To what extent can the work of human analysts be substituted with generative AI approaches within these types of mapping processes?

Furthermore, our approach focuses on the fusion of freely available volunteered geographic information (VGI) [5] data sources (OSM, Mapillary) and off-the-shelf AI tools to present a potentially low-cost and uniformly available solution. OSM is a collaborative project that aims to create a freely accessible worldwide map database (https://openstreetmap.org), while Mapillary crowdsources street-level photographs (https://mapillary.com) that power mapping and other applications, such as object detection, semantic segmentation and other computer vision algorithms that extract semantic information from imagery [3]. While the use of VGI has not yet been explored in the context of generative AI, previous studies demonstrated the practicability of combining multiple sources of VGI to improve the mapping process [13]. More specifically, Mapillary street-level images are routinely used to enhance OSM [8,10].

2 Study setup
2.1 Data sources and preparation
In OSM, geographic features are annotated with key-value pairs to assign the
correct feature category to them, a process called tagging [14]. For example,
roads are assigned a "highway"=<value> tag where <value> indicates a specific
road category, such as "residential" for a residential street.
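For illustration, this key-value tagging scheme can be sketched as a plain dictionary; the way below and its tag values are hypothetical examples, not taken from the study dataset:

```python
# A hypothetical OSM way represented as a dictionary of tags.
# OSM stores all tag values as strings, including numbers.
residential_street = {
    "highway": "residential",  # the road category discussed in the text
    "surface": "asphalt",
    "lanes": "2",
}

def road_category(tags):
    """Return the value of the "highway" tag, or None for non-road features."""
    return tags.get("highway")

print(road_category(residential_street))  # residential
```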
OSM data is not homogeneous: individual users may perceive roads differently and therefore assign different "highway" values to the same type of road. A list of "highway" tag values was established to better describe the meaning of each road category in OSM. Furthermore, the difference between some road categories, e.g. "primary" and "secondary", is more administrative in nature than a matter of visual appearance. For example, a 2-lane road in a rural area could be considered primary, whereas a more heavily trafficked road in an urban environment might be categorized as secondary. To consider semantic road categories rather than individual "highway" values as one of the evaluation methods, "highway" tag values representing similar roads in our dataset were grouped into four categories (Table 1).
Figure 2 shows the methodology to obtain OSM roads of interest with corresponding Mapillary street-level images. First, all OSM roads with a "highway"=* tag were extracted within the study area. Then, short sections (<50 m), inaccessible roads, sidewalks along roadways and roads without street-level photo coverage were excluded. Retained OSM roads were matched with corresponding Mapillary photographs, so that each road segment would have at least one representative Mapillary image. Lastly, a list of objects detected in the corresponding image was also extracted from the Mapillary API. These inputs were further used as described in Section 2.3.

Table 1. Grouping distinct "highway" tag values into semantically similar categories.

Category name                  OSM "highway"                      # of roads
Major, access controlled road  motorway|trunk                     0
Main road                      primary|secondary|tertiary         81
Regular road                   residential|unclassified|service   4
Not for motorized traffic      pedestrian|footway|cycleway        9
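The grouping in Table 1 can be sketched as a simple lookup, using exactly the categories and "highway" values listed there:

```python
# Semantic road categories from Table 1: each OSM "highway" value
# maps to one of four groups used as an evaluation method.
SEMANTIC_CATEGORIES = {
    "Major, access controlled road": {"motorway", "trunk"},
    "Main road": {"primary", "secondary", "tertiary"},
    "Regular road": {"residential", "unclassified", "service"},
    "Not for motorized traffic": {"pedestrian", "footway", "cycleway"},
}

def semantic_category(highway_value):
    """Map an individual "highway" tag value to its semantic group,
    or None if the value is not covered by the grouping."""
    for category, values in SEMANTIC_CATEGORIES.items():
        if highway_value in values:
            return category
    return None

print(semantic_category("secondary"))  # Main road
```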

Fig. 2. Workflow for preparing input from Mapillary
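The exclusion rules of this preparation step can be sketched as a simple filter. The field names on the road records (`length_m`, `accessible`, `sidewalk_along_roadway`, `mapillary_image_count`) are hypothetical, not the paper's actual data model:

```python
def keep_road(road):
    """Apply the exclusion rules from the preparation workflow:
    drop short sections (<50 m), inaccessible roads, sidewalks
    along roadways, and roads without street-level photo coverage.
    Field names are illustrative assumptions."""
    if road["length_m"] < 50:
        return False
    if not road["accessible"]:
        return False
    if road.get("sidewalk_along_roadway", False):
        return False
    if road["mapillary_image_count"] == 0:
        return False
    return True

roads = [
    {"length_m": 120, "accessible": True, "mapillary_image_count": 3},
    {"length_m": 30, "accessible": True, "mapillary_image_count": 5},
]
print([keep_road(r) for r in roads])  # [True, False]
```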

2.2 Resources
AI tools and models We utilize GPT-3.5-turbo, an advanced language model developed by OpenAI. It is an upgraded version of GPT-3, designed to offer improved performance and capabilities, and it retains the large-scale architecture of its predecessor, enabling it to generate coherent and contextually relevant text [15]. GPT-3.5-turbo serves as a powerful tool for natural language processing, content generation, and other language-related applications. In our study it is used to suggest OSM tagging based on pre-constructed prompts using the content of street-level images. The model was accessed through the OpenAI API.
BLIP-2 [11] is a state-of-the-art, scalable multimodal pre-training method designed to equip LLMs with the capability to understand images while keeping their parameters entirely frozen. This approach builds upon frozen pre-trained unimodal models and a proposed Querying Transformer (Q-Former), sequentially pre-trained for vision-language representation learning and vision-to-language generative learning. Despite operating with fewer trainable parameters, BLIP-2 has achieved exceptional performance in a range of vision-language tasks and shown potential for zero-shot image-to-text generation [11]. The methodology proposed in BLIP-2 contributes towards the development of an advanced multimodal conversational AI agent. We leverage BLIP-2's capability to generate image captions as well as to perform visual question-answering (Q&A). A freely available sample implementation of BLIP-2 was used to conduct this experiment (https://replicate.com/andreasjansson/blip-2).

Analysts Three analysts were tasked with describing the visual content of street-level images (captioning), and with answering a few questions regarding the image content (visual Q&A). Two (human) analysts were undergraduate students at Florida International University with previous GIS coursework. BLIP-2 was used to perform the same task as the human analysts, and its responses were recorded as the third (artificial) analyst. Analysts were deliberately not given any guidelines as to how to describe images, so that their answers would not be biased by prior knowledge about OSM and mapping. Table 2 lists the questions and tasks performed by analysts.

Table 2. Questions and tasks performed by analysts.

Variable   Question/task                                      Example response
"caption"  Describe what you see in the photo in your         A city road in an urban area along an
           own words.                                         elevated railway. There is a wide
                                                              sidewalk on both sides and trees on
                                                              the left.
"users"    Who are the primary users of the road that         Cars
           is located in the middle of the photograph?
           Cars, pedestrians or bicyclists?
"lanes"    How many traffic lanes are there on the road       3
           that is in the middle of the photograph?
"surface"  What is the material of the surface of the road    Asphalt
           that is in the center of the photograph?
"oneway"   Is the road that is in the center of the           No
           photograph one-way?
"lit"      Are there any street lights in the photograph?     Yes

The answers of analysts differ in level of detail. For example, BLIP-2's and Analyst #2's captions were significantly shorter on average (9 and 11 words, respectively) than Analyst #1's (37 words). BLIP-2's responses were also found to be more generic (e.g. "a city street with tall buildings in the background") than those of the human analysts. This allows us to explore the effect of providing increasing detail on tag suggestion accuracy.

2.3 Methodology for suggesting OSM tags

Figure 3 shows the methodology for suggesting tags for an OSM road. For each retained road in the area, the corresponding Mapillary images were shown to the analysts described in Section 2.2. Analysts created an image caption and answered simple questions as described in Table 2. These responses, in combination with additional context, were used to build prompts for an LLM to suggest OSM tags. To explore what influences the accuracy of suggested tags, a series of prompts were developed that differ in the level of detail that is presented to the LLM.

All prompts start with the following message that provides context and instructs the model about the expected output format.

Based on the following context that was derived from a street-level photograph showing the street, recommend the most suitable tagging for an OpenStreetMap highway feature. Omit the 'oneway' and 'lit' tags if the answer to the corresponding questions is no or N/A. Format your suggested key-value pairs as a JSON. Your response should only contain this JSON.

The remainder of individual prompts is organized into four scenarios constructed from the responses of analysts as well as additional context. Example responses from analysts and additional context are highlighted in boldface.

The Baseline scenario uses only responses from analysts, and contains the following text in addition to the common message above:

The content of the photograph was described as follows: A city road in an urban area along an elevated railway. There is a wide sidewalk on both sides and trees on the left. The road is mainly used by: cars. The surface of the road is: asphalt.
When asked how many traffic lanes there are on the road, one would answer: 3.
When asked if this street is a one-way road, one would answer: No.
When asked if there are any street lights in the photograph, one would answer: Yes.

The Locational context (LC) enhanced scenario provides ChatGPT with additional locational context describing where the roadways in question are located. In addition to the baseline message, it contains the following:

The photograph was taken near Downtown Miami, Florida.

The Object detection (OD) enhanced scenario uses a list of detected objects in addition to the baseline:

When guessing the correct category, consider that the following list of objects (separated by semicolon) are present in the photograph: Temporary barrier; Traffic light - horizontal; Traffic light - pedestrian; Signage

Finally, Object detection and locational context (OD + LC) are combined into a new scenario that supplies both additional contexts to the language model.
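The assembly of the four scenario prompts can be sketched as follows. The function and parameter names are illustrative assumptions, but the fixed instruction text and scenario sentences follow the prompts quoted in this section:

```python
# Fixed instruction that starts every prompt, as quoted in Section 2.3.
COMMON_MESSAGE = (
    "Based on the following context that was derived from a street-level "
    "photograph showing the street, recommend the most suitable tagging for "
    "an OpenStreetMap highway feature. Omit the 'oneway' and 'lit' tags if "
    "the answer to the corresponding questions is no or N/A. Format your "
    "suggested key-value pairs as a JSON. Your response should only contain "
    "this JSON."
)

def build_prompt(responses, location=None, objects=None):
    """Assemble one of the four scenario prompts (Baseline, LC, OD, OD + LC)
    from analyst responses (the Table 2 variables). Passing `location`
    and/or `objects` adds the LC and OD enhancements."""
    parts = [COMMON_MESSAGE]
    parts.append(
        "The content of the photograph was described as follows: "
        f"{responses['caption']} "
        f"The road is mainly used by: {responses['users']}. "
        f"The surface of the road is: {responses['surface']}."
    )
    parts.append(
        "When asked how many traffic lanes there are on the road, "
        f"one would answer: {responses['lanes']}."
    )
    parts.append(
        f"When asked if this street is a one-way road, one would answer: {responses['oneway']}."
    )
    parts.append(
        f"When asked if there are any street lights in the photograph, one would answer: {responses['lit']}."
    )
    if location:  # Locational context (LC) enhancement
        parts.append(f"The photograph was taken near {location}.")
    if objects:  # Object detection (OD) enhancement
        parts.append(
            "When guessing the correct category, consider that the following "
            "list of objects (separated by semicolon) are present in the "
            "photograph: " + "; ".join(objects)
        )
    return "\n".join(parts)
```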

Fig. 3. Workflow of using ChatGPT to suggest OSM "highway" tags

The last step in the process is to supply the prompts described above to GPT-3.5-turbo (ChatGPT for simplicity). The model responds with a JSON document containing the suggested OSM tagging for the roadway, e.g. ("highway"="primary", "lanes"=3), which can be compared to the original OSM tags of the same roadway.
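Since the model is instructed to return only a JSON document, the response-handling step can be sketched as below. This is a minimal illustration; the actual API call to GPT-3.5-turbo is omitted, and the sample reply is hypothetical:

```python
import json

def parse_suggestion(model_reply):
    """Parse the JSON document returned by the model into a tag dict.
    Returns None if the reply is not valid JSON (e.g. extra prose)."""
    try:
        return json.loads(model_reply)
    except json.JSONDecodeError:
        return None

# A hypothetical model reply, matching the expected output format.
reply = '{"highway": "primary", "lanes": "3"}'
tags = parse_suggestion(reply)
print(tags["highway"])  # primary
```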
The final dataset contains 94 OSM highway features and their original tags. For the four scenarios and three analysts described above, the recommendations given by ChatGPT based on the corresponding prompts were also recorded, resulting in a total of 12 tagging suggestions per roadway. These suggestions are then compared to the original OSM tags to assess the accuracy of a particular scenario and analyst.

3 Results
3.1 Accuracy of suggesting road categories
Table 3 lists the correctness of ChatGPT-suggested road categories based on two different methods. First, we consider historical "highway" values of an OSM road. A suggestion was considered correct if the current or any previous version of the corresponding OSM highway value matched ChatGPT's suggested tag. This step takes into account differences in how individual mappers may perceive road features (e.g. primary vs. secondary). The second method is based on the semantic road categories listed in Table 1. Considering groups of roads as opposed to individual "highway" values mitigates the fact that OSM tagging often follows administrative rules that are difficult to infer from photographs. Table 3 reports the accuracy of individual analysts across the four scenarios as well as the average correctness for analysts (values at the bottom) and scenarios (values in different rows).
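The two correctness checks can be sketched as follows, with the semantic groups taken from Table 1 (function names are illustrative):

```python
# Semantic groups from Table 1, used by the second evaluation method.
SEMANTIC_GROUPS = [
    {"motorway", "trunk"},
    {"primary", "secondary", "tertiary"},
    {"residential", "unclassified", "service"},
    {"pedestrian", "footway", "cycleway"},
]

def correct_by_history(suggested, historical_values):
    """Method 1: a suggestion is correct if it matches the current or
    any previous "highway" value of the OSM road."""
    return suggested in historical_values

def correct_by_semantics(suggested, current_value):
    """Method 2: a suggestion is correct if it falls in the same
    semantic group (Table 1) as the road's current "highway" value."""
    for group in SEMANTIC_GROUPS:
        if current_value in group:
            return suggested in group
    return False

print(correct_by_history("secondary", ["primary", "secondary"]))  # True
print(correct_by_semantics("primary", "secondary"))               # True
```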

Table 3. Accuracy score of OSM tags suggested by ChatGPT. (LC = Locational context, OD = Object detection, OD + LC = Object detection + Locational context)

Based on historical "highway" values

Scenario          BLIP-2  Analyst #1  Analyst #2  Avg. correct [%]  % change
Baseline          23.4    37.2        31.9        30.8              -
LC                24.5    46.8        34.0        35.1              +4.3
OD                27.7    47.9        39.4        38.3              +7.5
OD + LC           30.9    47.9        50.0        42.9              +12.1
Avg. correct [%]  26.6    45.0        38.8

Based on semantic road categories

Scenario          BLIP-2  Analyst #1  Analyst #2  Avg. correct [%]  % change
Baseline          25.5    54.3        41.5        40.4              -
LC                35.1    64.9        45.7        48.6              +8.2
OD                29.8    63.8        60.6        51.4              +11.0
OD + LC           43.6    66.0        70.2        59.9              +19.5
Avg. correct [%]  33.5    62.3        54.5

BLIP-2 achieved the lowest accuracy among the three analysts, followed by Analysts #2 and #1, respectively. This ordering mirrors the level of detail with which the analysts described photographs, which suggests that, in general, providing more detailed image captions may lead to more accurate tag suggestions by ChatGPT. This is further supported by the average accuracy achieved in the different scenarios. The baseline scenario, which used prompts purely based on the visual description of street-level photographs, achieved a suggestion accuracy of 30-40% on average across the three analysts. Providing additional context in the other scenarios increased this accuracy. Additional locational context, i.e. specifying that the roads are located near Downtown Miami (LC scenario), increased suggestion accuracy by 4.3-8.2% on average, depending on the evaluation method. This can potentially be explained by regional differences in OSM tagging practices, which are usually determined by local communities. In this scenario, it is possible that the AI model considered these regional differences when suggesting "highway" tags. Providing a list of objects detected in the source photographs (OD scenario) increased the average suggestion accuracy by 7.5-11.0% compared to the baseline scenario. A potential explanation for this is that objects found on and near roads provide important details that help refine the road category. Finally, combining the additional locational and object detection contexts (OD + LC scenario) with the description of photographs by analysts increases suggestion accuracy by 12.1-19.5% on average. It is important to mention that these improvements are observed across all analysts.

3.2 Additional tag suggestions

In addition to the main "highway" category, information about additional characteristics of roads can also be recorded in OSM. To assess such a scenario, we analyze the "lit" tag, which indicates the presence of lighting on a particular road segment. The "lit" tag is set to "yes" if there are lights installed along the roadway. One question explicitly asked analysts whether street lights are visible in the street-level photographs. In addition, street lights are a potential object category in Mapillary detections. For the following analysis, we consider the Object detection enhanced scenario. The original dataset contains 24 roadways with "lit"="yes".

Table 4 shows that ChatGPT correctly suggested the presence of the "lit" tag in between 63% (BLIP-2) and 92% (Analyst #2) of the existing cases. ChatGPT suggested the use of the "lit" tag for an additional 44-61 features that are potentially missing it in OSM. Among these features, 59 have been suggested based on prompts from at least two analysts, and 36 based on all three analysts.

Table 4. ChatGPT suggestions of the "lit" tag.

                  BLIP-2    Analyst #1  Analyst #2
Correctly tagged  15 (63%)  20 (83%)    22 (92%)
Additional        58        61          44
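The percentages in Table 4 follow directly from the counts against the 24 roadways tagged "lit"="yes"; a minimal helper, assuming conventional rounding:

```python
def lit_accuracy(correct, total_existing):
    """Share (in %) of roads carrying "lit"="yes" in OSM for which
    the model also suggested the tag, as reported in Table 4."""
    return round(100 * correct / total_existing)

# Counts from Table 4; the dataset contains 24 roads with "lit"="yes".
print(lit_accuracy(20, 24))  # Analyst #1: 83
print(lit_accuracy(22, 24))  # Analyst #2: 92
```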

3.3 Limitations of the results

One limitation of this study is that we conducted the experiment on a small, geographically limited sample. The implications of this are well studied in the GIScience literature, such as the uneven coverage of street-level photographs [9] and the heterogeneity of OSM tagging [4], which might limit the transferability of our method to different areas. The other group of limitations is largely related to open problems in Computer Science, such as the so-called "hallucinations" of generative AI, which result in content that is false or factually incorrect [1], as well as the non-deterministic nature of ChatGPT's answers [17].

4 Summary and discussion

This study presents an experiment utilizing ChatGPT as a mapping assistant to suggest tagging of OSM roads based on textual descriptions of street-level photographs. The workflow presented here relies on freely available sources of geospatial data (OSM, Mapillary) as well as off-the-shelf models (GPT-3.5-turbo and BLIP-2). This presents unprecedented opportunities for developing ways to improve the automated collaborative mapping process. Our results show that ChatGPT suggests the correct OSM "highway" value from natural language descriptions of street-level photographs provided by human analysts in 39-45% of the cases on average. This increases to 55-62% when considering semantic road categories rather than specific sub-types, which would otherwise be impossible to distinguish from photographs (e.g. secondary vs. primary roads). Another novel contribution of this study is the idea of substituting human analysts with the state-of-the-art BLIP-2 multimodal model to describe photographs. Even though this was less accurate, the method still resulted in 27-34% correct suggestions. Overall, this suggests a first step towards highly automated mapping systems that do not rely on human input.
Our study also explored ways to improve accuracy through prompt engineering and by providing additional context. In one scenario, GPT-3.5-turbo was instructed to consider the location of roads (i.e. Miami, FL), which improved accuracy by 4-8% depending on the evaluation method. In another scenario, a list of objects detected with computer vision algorithms was also provided in the prompt. This increased accuracy by 8-11% on average. Finally, combining the additional locational context and object detection resulted in 12-20% increases in suggestion accuracy. These improvements present a low-cost way to improve accuracy without modifying or fine-tuning the underlying models. However, future research is needed to explore and validate ways to use prompt engineering in the geospatial context. The study also indicates another way ChatGPT can be utilized as a useful mapping assistant: adding additional attribute information to OSM roadways. ChatGPT suggested the presence of the "lit" tag with 63-92% accuracy depending on the analyst. In addition, ChatGPT also suggested tagging an additional 44-61 road segments that might be missing the "lit"="yes" tag.

Our results suggest that increasing the level of detail with which a road scene is described in a prompt increases the accuracy of the suggested highway tags. This was found to be true both for the different levels of detail with which analysts described the content of photographs and for prompt engineering with additional context supplied to the LLM. However, it is expected that this method has limits in terms of the accuracy that can theoretically be achieved. Therefore, further research is needed to determine the best strategies as well as the most appropriate level of detail.
There are multiple potential extensions of this study in the context of leveraging generative AI as a mapping assistant. First, we plan to increase the sample size of OSM roads. This will enable us to assess performance under different conditions, e.g. in different geographic regions and across different road categories. Future research will also focus on adding and refining more traffic-related information in road maps, e.g. speed limits, biking infrastructure, and turn restrictions.

While experiments like this are useful initial steps, we urge the GIScience community to go beyond simply applying AI in geographic contexts and focus on synergistic research that advances both the spatial sciences and AI research (see e.g. [12,6]). There are multiple potential extensions of this work along these lines that go beyond the case study presented in this paper. For example, the exploration of a multimodal conversational AI agent for spatial data science is a promising research direction. In theory, incorporating a spatial understanding component in a multimodal AI system allows it to comprehend and analyze geospatial data. This could result in a method for the AI to interpret and interact with geospatial data, similar to how BLIP-2 enables language models to understand images. Future research should focus on exploring the potential of this integration, and on deepening our understanding of the theoretical and practical aspects of this fusion. This, in turn, will advance the field and lay the groundwork for future innovations in comprehensive multimodal AI systems for geospatial science.

Data availability The dataset supporting the findings presented in this paper is freely available at: https://doi.org/10.17605/OSF.IO/M9RSG.

Acknowledgements The authors would like to thank Mei Hamaguchi and Flora Beleznay for providing image captions and visual Q&A.

References
1. Bang, Y., Cahyawijaya, S., Lee, N., Dai, W., Su, D., Wilie, B., Lovenia, H., Ji, Z., Yu, T., Chung, W., Do, Q.V., Xu, Y., Fung, P.: A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity (Feb 2023). https://doi.org/10.48550/arXiv.2302.04023, arXiv:2302.04023 [cs]
2. Dwivedi, Y.K., Kshetri, N., Hughes, L., Slade, E.L., Jeyaraj, A., Kar, A.K., Baabdullah, A.M., Koohang, A., Raghavan, V., Ahuja, M., Albanna, H., Albashrawi, M.A., Al-Busaidi, A.S., Balakrishnan, J., Barlette, Y., Basu, S., Bose, I., Brooks, L., Buhalis, D., Carter, L., Chowdhury, S., Crick, T., Cunningham, S.W., Davies, G.H., Davison, R.M., Dé, R., Dennehy, D., Duan, Y., Dubey, R., Dwivedi, R., Edwards, J.S., Flavián, C., Gauld, R., Grover, V., Hu, M.C., Janssen, M., Jones, P., Junglas, I., Khorana, S., Kraus, S., Larsen, K.R., Latreille, P., Laumer, S., Malik, F.T., Mardani, A., Mariani, M., Mithas, S., Mogaji, E., Nord, J.H., O'Connor, S., Okumus, F., Pagani, M., Pandey, N., Papagiannidis, S., Pappas, I.O., Pathak, N., Pries-Heje, J., Raman, R., Rana, N.P., Rehm, S.V., Ribeiro-Navarrete, S., Richter, A., Rowe, F., Sarker, S., Stahl, B.C., Tiwari, M.K., van der Aalst, W., Venkatesh, V., Viglia, G., Wade, M., Walton, P., Wirtz, J., Wright, R.: "So what if ChatGPT wrote it?" Multidisciplinary perspectives on opportunities, challenges and implications of generative conversational AI for research, practice and policy. International Journal of Information Management 71, 102642 (2023). https://doi.org/10.1016/j.ijinfomgt.2023.102642
3. Ertler, C., Mislej, J., Ollmann, T., Porzi, L., Neuhold, G., Kuang, Y.: The Mapillary Traffic Sign Dataset for Detection and Classification on a Global Scale. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.M. (eds.) Computer Vision – ECCV 2020. pp. 68–84. Lecture Notes in Computer Science, Springer International Publishing, Cham (2020). https://doi.org/10.1007/978-3-030-58592-1_5
4. Girres, J.F., Touya, G.: Quality assessment of the French OpenStreetMap dataset. Transactions in GIS 14(4), 435–459 (2010). https://doi.org/10.1111/j.1467-9671.2010.01203.x
5. Goodchild, M.F.: Citizens as sensors: the world of volunteered geography. GeoJournal 69(4), 211–221 (2007). https://doi.org/10.1007/s10708-007-9111-y
6. Janowicz, K.: Philosophical foundations of GeoAI: Exploring sustainability, diversity, and bias in GeoAI and spatial data science (2023)
7. Janowicz, K., Gao, S., McKenzie, G., Hu, Y., Bhaduri, B.: GeoAI: spatially explicit artificial intelligence techniques for geographic knowledge discovery and beyond. International Journal of Geographical Information Science 34(4), 625–636 (2020). https://doi.org/10.1080/13658816.2019.1684500
8. Juhász, L., Hochmair, H.H.: Cross-linkage between Mapillary Street Level Photos and OSM Edits. In: Sarjakoski, T., Santos, M.Y., Sarjakoski, T. (eds.) Geospatial Data in a Changing World: Selected papers of the 19th AGILE Conference on Geographic Information Science. Lecture Notes in Geoinformation and Cartography, pp. 141–156. Springer, Berlin (2016). https://doi.org/10.1007/978-3-319-33783-8_9
9. Juhász, L., Hochmair, H.H.: User Contribution Patterns and Completeness Evaluation of Mapillary, a Crowdsourced Street Level Photo Service. Transactions in GIS 20(6), 925–947 (2016). https://doi.org/10.1111/tgis.12190
10. Juhász, L., Hochmair, H.H.: How do volunteer mappers use crowdsourced Mapillary street level images to enrich OpenStreetMap? (2017)
11. Li, J., Li, D., Savarese, S., Hoi, S.: BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models (May 2023). https://doi.org/10.48550/arXiv.2301.12597, arXiv:2301.12597 [cs]
12. Li, J., Li, D., Xiong, C., Hoi, S.: BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation (Feb 2022). https://doi.org/10.48550/arXiv.2201.12086, arXiv:2201.12086 [cs]
13. Liu, L., Olteanu-Raimond, A.M., Jolivet, L., Le Bris, A., See, L.: A data fusion-based framework to integrate multi-source VGI in an authoritative land use database. International Journal of Digital Earth 14(4), 480–509 (2021)
14. Mooney, P., Corcoran, P.: The Annotation Process in OpenStreetMap. Transactions in GIS 16(4), 561–579 (2012). https://doi.org/10.1111/j.1467-9671.2012.01306.x
15. Ouyang, L., Wu, J., Jiang, X., Almeida, D., Wainwright, C.L., Mishkin, P., Zhang, C., Agarwal, S., Slama, K., Ray, A., Schulman, J., Hilton, J., Kelton, F., Miller, L., Simens, M., Askell, A., Welinder, P., Christiano, P., Leike, J., Lowe, R.: Training language models to follow instructions with human feedback (Mar 2022). https://doi.org/10.48550/arXiv.2203.02155, arXiv:2203.02155 [cs]
16. Reynolds, L., McDonell, K.: Prompt Programming for Large Language Models: Beyond the Few-Shot Paradigm. In: Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems. pp. 1–7. CHI EA '21, Association for Computing Machinery, New York, NY, USA (May 2021). https://doi.org/10.1145/3411763.3451760
17. Wang, S., Scells, H., Koopman, B., Zuccon, G.: Can ChatGPT Write a Good Boolean Query for Systematic Review Literature Search? (Feb 2023). https://doi.org/10.48550/arXiv.2302.03495, arXiv:2302.03495 [cs]
