A Deep Dive Into The Accuracy of IP Geolocation
Abstract: The quest for an increasingly personalized Internet experience relies on enriched contextual information about each user. Online advertising follows the same approach, and location is one of the most important pieces of context that advertising stakeholders leverage. However, when this information is not directly available from the end users, advertising stakeholders infer it using geolocation databases that match IP addresses to a position on Earth. The accuracy of this approach has often been questioned in the past; however, a reality check on an advertising stakeholder shows that this technique accounts for a large fraction of the served advertisements. In this paper, we revisit the work in the field, most of which dates from almost a decade ago, through the lens of big data. More specifically, we i) benchmark two commercial Internet geolocation databases, evaluating the quality of their information using a ground-truth database of user positions containing over 2 billion samples, ii) analyze the internals of these databases, devising a theoretical upper bound for the quality of the Internet geolocation approach, and iii) run an empirical study that unveils the monetary impact of this technology by considering the costs associated with a real-world ad impressions dataset.
arXiv:2109.13665v2 [cs.CY] 1 Jun 2022
1 INTRODUCTION
returns a bid-response, including the price it is willing to pay for the offered ad space. The AdX runs a real-time auction process based on the received bid-responses and selects the winning DSP, which will handle the delivery of the ad impression to the user.
The ad delivery process is subject to a monetary transaction. The most common pricing schemes in online advertising are CPM (cost per one thousand impressions) and CPC (cost per click on an ad impression). Note that CPM and CPC are metrics that are known a posteriori, once the campaign is finished. A proxy metric for the cost of an ad impression is the bid floor. This is a variable in the bid-requests that indicates the minimum bidding price accepted by the publisher offering the ad space.

2.2.2 Location data sources
DSPs have access to the location associated with an ad space through the location information embedded in bid-requests from three possible data sources [19].
- User: The location data is provided by the user and embedded in the ad-request. For instance, location information (e.g., an address) provided by the user through a registration form. This type of location appears rarely in bid-requests.
- GPS/Location Services: This type of data is expected to provide high precision, and in practice, it should come directly from the positioning device of the user, offering GPS precision. Given the high precision of the data, bid-requests including this type of location data are expected to have a higher starting bid price for the auction.
- IP address: A significant number of ad-requests leave the user device without any location information. Due to the importance of location in online advertising, it is common that one of the intermediaries in the ecosystem (e.g., the AdX) enriches the ad-request or its corresponding bid-request with location information based on the IP address of the device. To this end, they use the GeoIP databases described above.
To understand the importance of GeoIP in the online advertising ecosystem, we have computed the fraction of daily bid-requests including a GeoIP, GPS or unavailable location received by TAPTAP Digital [20], a mid-size DSP (see details in §3), in its bid stream (i.e., the bid-request flow). In particular, we have measured this metric for the bid-requests of the three countries analyzed in this paper (Spain, France, and Great Britain) during a period of 16 days. The results show that the average fraction of daily bid-requests across the considered countries including GeoIP, GPS, or unavailable location data is 52%, 18%, and 30%, respectively. In particular, 48.0%, 50.8%, and 57.3% of the bid-requests for Spain, France, and Great Britain, respectively, include GeoIP location information. It is important to remark that most DSPs process bid-requests with unavailable location data. They extract the IP address of the device from the bid-request and obtain an associated location from a GeoIP database. This location data assignment technique allows DSPs to effectively have a location for all received bid-requests. In summary, roughly half of the bid-requests (and up to 80% in those DSPs using the described location data assignment technique) include locations extracted from GeoIP based on our dataset. These values further corroborate the fact that GeoIP is the most common technology for providing location information in online advertising.
If an ad-request does not include a location context, depending on the kind of ad campaign, it will be unlikely to find a matching user. Hence, location is a very sensitive parameter for the efficiency of an ad campaign, which may range from coarser levels (e.g., country) to very fine ones, targeting users at a zip code level.

2.3 Ground-truth location data
To achieve our goal of assessing the accuracy of GeoIP location data and its impact on online advertising, we had to resort to a data source that provides high-precision geolocation information for an extremely high volume of users.
Multiple location providers collect high-precision location information from users. Some examples are Safegraph [21], Cuebiq [22], Foursquare [23], and Tamoco [24], to name a few. These providers use different techniques to obtain accurate location information from users, as described next:
- Embedded SDKs in mobile apps: The location provider agrees to include its SDK in the mobile app(s) of a given app developer. This SDK leverages the permission granted by the end users to the "host" application and collects the GPS location information from the device, as well as other parameters, including the IP address.
- Check-ins: The user proactively registers a check-in at a specific venue (e.g., restaurants, coffee shops, etc.) when it happens (usually to contextualize posts on online platforms). The accurate location of the venue is well known, and thus users can be located with high precision and with total transparency to them, as they are consciously interacting with the app to provide such information.
In this paper, we use a dataset from a location provider distributing an embedded SDK in mobile apps (see details in §3) as our ground-truth information for the precise positioning of end users.

3 DATASETS
This section describes the datasets and the evaluation scenarios we use in the remainder of the paper. In our study, we limit the analysis to three major European countries (Spain, France, and Great Britain) where online advertising presents a strong penetration and for which we have a good coverage in our datasets.

3.1 GeoIP Databases
We leverage two of the most widely used GeoIP databases to analyze the performance of the GeoIP location technology. We keep the names of these providers anonymous since our research aims not to scrutinize specific providers but rather to assess the performance of GeoIP in the context of online advertising. In the rest of the paper, we refer to the datasets from these GeoIP databases as GeoIP-DB-A and GeoIP-DB-B, respectively. Both providers offer their database as commercial products, have wide coverage in the three considered countries, and update their database
IEEE TRANSACTIONS ON MOBILE COMPUTING 4
weekly. We collected regular snapshots of such databases to check the consistency of the data over time. For instance, GeoIP-DB-A includes all the IP-location samples between July-2020 and May-2021. For GeoIP-DB-B, instead, we gathered data for the period between April-2021 and May-2021. These databases include all the information needed to match the IP address to a position and other side information such as the kind of access technology associated with a given IP.

          Spain                  France       Great Britain
Level 1   Post Code              Post Code    Post Code
Level 2   City/Municipality      Commune      Local Authority District
Level 3   Province               Department   County / Region
Level 4   Autonomous Community   Region       Country
Level 5   Country                Country      Kingdom

TABLE 1: Administrative Levels considered in each country.
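
As an illustration of how a GeoIP snapshot can be queried, the matching of an IP address to a position can be sketched as a range lookup. This is a minimal sketch and not the schema or algorithm of GeoIP-DB-A or GeoIP-DB-B: the row layout (prefix, latitude, longitude), the function names `build_index` and `lookup`, and the sample rows are assumptions, the prefixes are reserved documentation ranges, and the ranges are assumed not to overlap.

```python
import bisect
import ipaddress

# Hypothetical snapshot rows: (network in CIDR notation, latitude, longitude).
# The prefixes are reserved documentation ranges, not real database entries.
SNAPSHOT = [
    ("192.0.2.0/24", 40.4168, -3.7038),
    ("198.51.100.0/24", 48.8566, 2.3522),
    ("203.0.113.0/25", 51.5074, -0.1278),
]


def build_index(rows):
    """Sort ranges by first address so they can be searched with bisect."""
    ranges = []
    for cidr, lat, lon in rows:
        net = ipaddress.ip_network(cidr)
        first = int(net.network_address)
        last = int(net.broadcast_address)
        ranges.append((first, last, (lat, lon)))
    ranges.sort()
    starts = [first for first, _, _ in ranges]
    return starts, ranges


def lookup(index, ip):
    """Return the (lat, lon) of the range containing ip, or None."""
    starts, ranges = index
    value = int(ipaddress.ip_address(ip))
    # Rightmost range whose first address is <= the queried address.
    pos = bisect.bisect_right(starts, value) - 1
    if pos < 0:
        return None
    first, last, position = ranges[pos]
    return position if first <= value <= last else None


INDEX = build_index(SNAPSHOT)
print(lookup(INDEX, "198.51.100.77"))  # inside the second range
print(lookup(INDEX, "203.0.113.200"))  # outside the /25, so no match
```

Because every address in a range resolves to the same position, many distinct users behind one prefix share a single reported location.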
3.2 Bid stream dataset
In this work, we measure the impact of the accuracy of GeoIP for location-based online advertising by analyzing real bid-request flows (a.k.a. bid stream) gathered from the Sonata DSP [25] operated by TAPTAP Digital [20], a digital marketing company operating in 15 countries. Sonata is a mid-size DSP whose bid stream includes a large-scale sample of bid-requests generated from Spain, France, and Great Britain. In particular, we processed the bid stream collected by Sonata between 1-May-2021 and 17-May-2021, which includes an average number of daily bid-requests of 257.6M, 64.5M, and 54.1M for ES, FR, and GB, respectively. While a bid-request may include several features related to the user context, we only process the fields that are relevant for our study, namely: <timestamp; IP address; Location Source; lat,long>. The location source field corresponds to those defined in §2.2.2: GPS, GeoIP, or User, or unavailable in case no source is reported. Note that for the analysis we only select the GPS information, as we can use it as ground-truth information.

3.3 (Ground-truth) GPS location data
To validate the performance of GeoIP, we use a dataset from a location provider distributing an embedded SDK in mobile apps.¹ This dataset reports GPS location coordinates, which we consider as the reference ground-truth position for the end users. This location data provider operates in more than 15 international markets including Spain, France, Italy, Great Britain, US, Mexico, Argentina, Colombia, South Africa, etc. Its SDK is embedded in dozens of applications, including popular applications such as weather, news or radio apps. It offers a coverage of at least 5% of the population in the main markets where it operates. Finally, our provider's location data is used by customers across different sectors such as online advertising, retail, e-commerce, real estate and financial services, among others.
1. The name of this provider is kept anonymous at its express request.
In particular, this dataset includes the following data tuple per location event: <timestamp; lat,long; IP address; carrier>. The dataset spans a period of 30 days (from 1-Sep-20 to 30-Sep-20) and provides a very reliable snapshot of the mobile users. On average, the number of daily location samples is 31M, 16M, and 20M for ES, FR, and GB, respectively. In total, we have for the three countries more than 2.05B data samples for the considered period. To the best of the authors' knowledge, this is the largest ground-truth dataset ever used for analyzing the performance of GeoIP databases, exceeding by several orders of magnitude the datasets used in previous studies. We refer to it as GT-DB in the remainder of the paper. As we report in §4.7, the results obtained with the ground-truth data from our provider are aligned with those reported by a major GeoIP database provider. We believe that this represents a significant hint about the quality of the used ground-truth data.

4 GEOIP PERFORMANCE
This section evaluates the performance obtained by GeoIP under several scenarios relevant to the online advertising market. We present the overall methodology implemented to compute GeoIP performance in §4.1, before analyzing the different results in the following subsections.

4.1 Methodology
We benchmark GeoIP-DB-A and GeoIP-DB-B using as reference the GT-DB database, which provides high-accuracy location for several millions of users and precise time information that allows us to compare the ground-truth samples to the proper instance of the GeoIP databases.
By joining GT-DB with GeoIP-DB-A and GeoIP-DB-B we obtain two (latitude, longitude) pairs for the same IP address at a specific time, one belonging to GT-DB (used as ground-truth, posGT) and the other belonging to the GeoIP generated instance, posIP. Thus, for all the IP addresses of the GT-DB database we compute the distance between posGT and posIP using the Haversine distance [26] formula, which yields the distance between any two (latitude, longitude) pairs on the Earth. Formally,

E = hav(posGT, posIP)    (1)

We use this approach in the rest of this section for assessing the databases' Precision, which is a metric that evaluates the pure distance.
However, ad campaigns usually include specific location targets that often correspond to concrete administrative boundaries, such as countries, regions, cities, or zip codes. Therefore, in the context of online advertising, the performance of a GeoIP service should be measured by its capacity to properly locate users within the targeted administrative region. We refer to this metric as Accuracy in this paper.
For the accuracy analysis, we used the Shapefiles available on the open data portals [27], [28] of the different countries we analyzed and extracted the geographical extent information related to the different administrative regions. In order to increase the scalability of this analysis, we divided the space into a fixed grid using the Uber H3 [29] geographical spatial index, to transform geographical joins into standard joins.
We formally define the accuracy metric A in Eq. (2).
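
The precision error E of Eq. (1) can be illustrated with a short Haversine routine. This is a minimal sketch under stated assumptions rather than the code used in the study: the function name `haversine_km`, the mean Earth radius constant, and the example coordinates are introduced here for illustration only.

```python
import math

# Mean Earth radius in kilometres; the exact constant is an assumption of this sketch.
EARTH_RADIUS_KM = 6371.0088


def haversine_km(pos_gt, pos_ip):
    """Great-circle distance in Km between two (latitude, longitude) pairs in degrees."""
    lat1, lon1 = map(math.radians, pos_gt)
    lat2, lon2 = map(math.radians, pos_ip)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    # min() guards against rounding slightly above 1 for antipodal points.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


# Illustrative coordinates only: a ground-truth fix in Madrid and a
# GeoIP position in Barcelona would give an error of roughly 500 Km.
pos_gt = (40.4168, -3.7038)
pos_ip = (41.3874, 2.1686)
print(round(haversine_km(pos_gt, pos_ip), 1))
```

Applying such a function to every (posGT, posIP) pair yields the per-sample errors whose distribution is analyzed in the rest of the section.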
[Figure 1: three map panels with lon (horizontal) and lat (vertical) axes and a logarithmic Density color scale.]
Fig. 1: Number of anchor points in the analyzed countries. Brighter colors indicate a larger concentration of anchors. (Figure
best viewed in colors).
A = \frac{pos_{IP} \in R \mid pos_{GT} \in R}{pos_{IP} \in R} \qquad (2)

where R is the targeted spatial region associated to, e.g., an administrative division. Our accuracy analysis considers 5 different administrative levels from smaller (Level 1) to larger (Level 5) size as reported in Tab. 1.²
2. Note that Levels 1 and 2 are not always hierarchical. For instance, there are some zip codes in rural areas in Spain that include several villages.

4.2 Space and Time variability
Before analyzing the performance of GeoIP, we discuss in this subsection some overall statistics of the analyzed GeoIP datasets. As introduced in §2, location information is actually inferred based on IP prefixes rather than IP addresses (i.e., contiguous IP addresses usually share the same position).
We report in Table 2 the GeoIP-DB-A extent in the three countries under study, obtained by performing an exhaustive search on the entire IP address space. Besides the number of different ranges and anchor points, we also compute the reuse factor, i.e., the average number of IP ranges that are mapped to the same position.

                  Spain     France    GB
# IP ranges       139687    399500    1051937
# anchor points   5288      16367     10448
Reuse factor      26.41     24.40     100.68

TABLE 2: Extent of the GeoIP-DB-A database.

The analysis of the reuse factor shows a good correlation with the average population density in the specific countries: Spain and France, with a population density of 92.76 and 123.28 persons per Km², present a reuse factor around 25 (26.41 and 24.40, respectively). Great Britain, instead, has a much higher population density (279.95 persons per Km²) that is reflected by a higher reuse factor as well, 100.68.
Fig. 1 depicts the spatial landscape of the |posIP| set, which leads to a similar conclusion. The algorithms implemented by the GeoIP-DB-A database accurately match the most densely populated areas of Spain, France, and Great Britain, such as the capitals and the most populous cities (e.g., Barcelona, Marseille, and the Liverpool–Manchester Megalopolis). In contrast, they present much lower resolution in rural areas such as the Castilla-La Mancha region in Spain, the Massif Central in France, and the Scottish Highlands. As we quantify in §4.4, the lack of an anchor point for these zones introduces a large error in the location estimation for the (fewer) users located there.
Both GeoIP-DB-A and GeoIP-DB-B are periodically updated to account for movement among IP ranges and refine the location estimation according to their algorithm. However, we did not notice any substantial deviation in the computed precision over time. Considering a 30-day time window, the median daily error recorded from the two databases only showed a variance of 1.34 m, 2.51 m, and 11.64 m for Spain, France, and Great Britain, respectively.³ For this reason, unless otherwise stated, in the rest of the paper, we analyze a time window of 30 days without distinguishing between weekends, weekdays, day or night, as the time dynamics involved in the update process are probably longer.
3. This corroborates the fact that providers are constantly improving their records, as the average deviation of the reported IP prefixes position is much higher, as reported by [18].

4.3 Global cross-country comparison
4.3.1 Precision
We first study the overall precision attained by GeoIP-DB-A and GeoIP-DB-B in the reference countries by showing the CDF of E in Fig. 3. The precision distribution shows poor behavior in the three countries and for the two explored GeoIP databases, which show very similar results. For instance, the median error for GeoIP-DB-A (GeoIP-DB-B) in Spain, France, and Great Britain is 14.01 Km (14.91 Km), 13.61 Km (14.56 Km), and 15.70 Km (18.9 Km), respectively. In addition, there are only a few samples with an error below 1 Km (at best 12.1% for GeoIP-DB-A in France and 10.6% for GeoIP-DB-B in Spain), while the percentage of samples with a very low precision beyond 100 Km grows up to 24% in the best case (in Great Britain, for GeoIP-DB-B).
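
The precision indicators quoted above (median error, share of samples below 1 Km, share beyond 100 Km) can be derived from a list of per-sample errors E as sketched below. This is only an illustration: the function name `precision_summary`, its parameters, and the sample values are assumptions introduced here, and the numbers are made up rather than taken from GT-DB.

```python
import statistics


def precision_summary(errors_km, near_km=1.0, far_km=100.0):
    """Summarize per-sample errors E (in Km) with the indicators used in the text."""
    n = len(errors_km)
    if n == 0:
        raise ValueError("at least one error sample is required")
    return {
        "median_km": statistics.median(errors_km),
        # Fraction of samples located within near_km of the ground truth.
        "share_below_near": sum(e < near_km for e in errors_km) / n,
        # Fraction of samples located farther than far_km from the ground truth.
        "share_beyond_far": sum(e > far_km for e in errors_km) / n,
    }


# Made-up errors, for illustration only (not taken from the paper's data).
sample = [0.4, 3.0, 9.5, 14.0, 22.0, 48.0, 150.0, 310.0]
print(precision_summary(sample))
```

The same three numbers correspond to reading the empirical CDF of E at the 50th percentile and at the 1 Km and 100 Km thresholds.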
[Plot residue: bar panels over Level 1 to Level 5 and empirical CDF panels over 100m to 1000Km; legend entries: FR, GB, Semi-urban, Rural.]
Fig. 4: Precision breakdown per urbanization level
[Figure 5 panels: Spain, France, Great Britain; horizontal axis: Level 1 to Level 4; vertical axis: 0% to 100%; legend: Urban, Semi-urban, Rural.]
Fig. 5: Accuracy by administrative regions and level of urbanization (dashed lines represent the overall accuracy).
administrative levels, except Level 1 (i.e., zip code). The Spanish case, in particular, showcases very large differences between A measured in urban areas and rural areas: for Level 2 we can observe a dramatic drop from 47% to 11%, further corroborating the considerations made in §4.3.2. In contrast, France presents comparable results among the urbanization degrees, being the least unequal of the analyzed countries. This effect may be due to the more uniform spread of anchor points across the country.

4.5 The influence of the access technology
4.5.1 Precision
Regardless of the type of devices that are used to gain access to the Internet (e.g., a desktop PC, a laptop, or a mobile phone), a factor that likely affects the precision of GeoIP services is the access network technology. It seems obvious that pinpointing the location of a mobile device will generate larger errors than estimating the position of a device connected through a broadband fixed-access technology (e.g., Fiber, ADSL, or WiFi).
Databases such as GeoIP-DB-A and GeoIP-DB-B usually also offer, for user targeting purposes, the kind of access technology associated to a given IP address. However, our ground-truth database GT-DB is built using data coming from mobile terminals, which likely only have two types of access network technologies: WiFi and mobile.
Fig. 6 shows E according to the connection interface inferred from GeoIP-DB-A for Spain, France, and Great Britain. The behavior is consistent across the countries. As expected, cellular connections lead to much larger errors than WiFi. For most percentiles in the distribution, the gap exceeds one order of magnitude in all the countries. Moreover, the location samples obtained through the mobile network where E ≤ 1 Km are anecdotal (3% for the best case, in France). The extremely bad performance of cellular connections is not compensated by WiFi. Even for the best case, in Spain, only 17% of the users could be located within 1 Km, corroborating that this technology, at least for the analyzed databases, is a no-go for precise-location targeted advertising.

4.5.2 Accuracy
It is indeed remarkable that not even the use of the WiFi technology yields good results for the most challenging scenarios: the best case (Level 2 in GB) only achieves A equal to 52.7%, while in France this value drops to 29.0% for the same administrative level. Fig. 7 confirms the pronounced unreliability of IP-based geolocation for cellular access technologies, with values of A that are often below 10% for the most challenging scenarios and around 50% for the least challenging ones (only Level 4 in GB seems to be well mapped).

4.6 The variation across different ISPs
4.6.1 Precision
The algorithms employed by GeoIP-DB-A to map IP addresses to a posIP may rely on active latency measurements between known milestones on the Internet. Hence, the number and the internal configuration of prefixes for the different ISPs could have a relevant impact on E. We assess this by further splitting the precision obtained for users connected through their mobile interface (discussed in §4.5) into the different ISPs. For this purpose, we use the information available in the GT-DB database, which collects the carrier name displayed in the mobile terminal. Fig. 8 shows the achieved E for the four most relevant carriers in each of the considered countries.
We observe notably different behaviors in the yielded E for all the countries. In Spain, there is almost an order of magnitude difference for the median E between the least and the most precise ISPs. For the French case, this difference is even broader, with SFR as the best option with a remarkably high median precision of 4.5 Km, and Orange as the worst case for GeoIP location purposes with a median E = 173.2 Km. Considering that both operators are the most popular in France, according to their popularity in the GT-DB dataset, we ascribe this difference to a worse performance of the position matching algorithm used by GeoIP-DB-A. Finally, the ISP choice in GB has the lowest impact among the analyzed countries, with a close gap between the median E of the analyzed operators. In a nutshell, with few exceptions, for all the analyzed operators, the 25th
[Plot residue: empirical CDF panels over 100m to 1000Km with a Mobile series; bar panels over Level 1 to Level 4 with a 0% to 50% axis.]
Fig. 7: Accuracy by administrative regions and connection type. (dashed lines represent the overall accuracy)
[Plot residue: per-carrier panels with a 1Km to 100Km axis for Yoigo, Vodafone, Movistar, Orange; Orange, Bouygues, NRJ, SFR; Three, Sky, TMobile, O2; bar panels over Level 1 to Level 4 with a 0% to 40% axis.]
Fig. 9: Accuracy by administrative regions and ISPs (dashed lines represent the overall accuracy)
percentile of E is above 10 Km. This confirms that making a fine-grain selection of GeoIP locations per operator cannot be used for precise location-targeted advertising or other similar services.

4.6.2 Accuracy
We measure A for the different carriers in Fig. 9. As expected, the carriers that yield a lower median E generally translate into a higher A. However, the quite large differences in the median precision observed in France do not translate into very large differences in terms of accuracy, while the less dispersed situation in GB yields quite diverse performance for some carriers, especially for the Level 2 divisions. Instead, the differences in Spain in terms of E have a more direct relationship to A, as GeoIP-DB-A reaches the lowest values for the Yoigo operator. This hints at the complexity of the task GeoIP databases perform: a complex mix of active and passive measurements that treat the core networks and the interconnections of ISPs as black boxes.

4.7 Providers' reported performance
Most GeoIP databases provide only high-level reporting about the offered precision and/or accuracy, except for Maxmind, which offers detailed reporting [31]. Although Maxmind's report does not cover as many dimensions as we cover in our research, it offers precision data at several thresholds (10, 25, 50, 100, and 250 km) and accuracy values at two administrative levels (zip code and city) for around a hundred countries. An important difference with Maxmind is that we provide a detailed description of our methodology to study the precision and accuracy of GeoIP, whereas Maxmind does not disclose its methodology.
We have compared Maxmind's and our outcome for the three analyzed countries in this paper. The results are,
[Figure 10 panels: per-country (ES, FR, GB) bars over 0% to 20% (left); empirical CDF over 100m to 100Km (center); bars over Level 1 to Level 4 (right).]
Fig. 10: A for GeoIP-DB-A to the best possible anchor point (left), E (center) and A (right) achieved by the best possible
scenario.
in general, well-aligned. This means our study is the first academic validation of the correctness of the precision and accuracy results reported by Maxmind.

5 DISSECTING THE GEOIP INTERNALS
In this section, we go one step further and try to analyze the different components that may contribute to the lack of performance analyzed in §4. Neither GeoIP-DB-A nor GeoIP-DB-B discloses the algorithms and techniques they use to provide the mapping between IP and location, although we know from generic statements published on the vendor websites and from the literature [] that they are likely using a mixture of active and passive measurements, possibly combined with machine learning technologies and datasets close to GT-DB.
While proposing improved solutions for GeoIP is out of the scope of this paper, in this section we propose a methodology to discover the upper bound of the GeoIP performance, a metric that we will leverage for the online advertising case study discussed in §6.

answer this question by computing the fraction of users that are mapped with both posGT and posIP within the same Voronoi cell. If this happens, then it means that the selected posIP is actually the best possible one among the set of available anchor points. If not, it means that there was an anchor point closer to posGT than posIP that was not selected. The results of this analysis are shown in the left part of Fig. 10.
In this situation, GeoIP-DB-A cannot go beyond an overall accuracy of 20% in the best case (Spain), i.e., more than 80% of the IP addresses are not mapped to the best location. This is even more dramatic for the GB case, where just 6% of the addresses are mapped to the best anchor points.
This corroborates the complexity of the tasks that GeoIP providers face: while they can quite effectively map more densely populated areas with more anchor points, the users' geographical spread (especially for the ones using the mobile network, as we analyze in §4.5.1) makes it very difficult to condense IP ranges into the best possible anchor point.
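
The Voronoi test described above can be illustrated without building the cells explicitly, since posGT and posIP share a Voronoi cell exactly when posIP is the anchor point nearest to posGT (ties aside). The sketch below relies on that equivalence. It is not the authors' implementation: the function names, the brute-force nearest-neighbor search, and the sample coordinates are assumptions introduced for illustration.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in Km between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0088 * math.asin(min(1.0, math.sqrt(h)))


def nearest_anchor(pos_gt, anchors):
    """Anchor whose Voronoi cell contains pos_gt (brute-force search)."""
    return min(anchors, key=lambda anchor: haversine_km(pos_gt, anchor))


def best_anchor_fraction(samples, anchors):
    """Fraction of (pos_gt, pos_ip) samples where pos_ip is the closest anchor."""
    if not samples:
        raise ValueError("at least one sample is required")
    hits = sum(1 for pos_gt, pos_ip in samples
               if nearest_anchor(pos_gt, anchors) == pos_ip)
    return hits / len(samples)


# Hypothetical anchor points and samples, for illustration only.
MADRID, BARCELONA, SEVILLE = (40.4168, -3.7038), (41.3874, 2.1686), (37.3891, -5.9845)
anchors = [MADRID, BARCELONA, SEVILLE]
samples = [
    ((39.8628, -4.0273), MADRID),     # closest anchor selected
    ((41.9794, 2.8214), MADRID),      # a closer anchor (BARCELONA) exists
    ((36.5271, -6.2886), SEVILLE),    # closest anchor selected
    ((41.1189, 1.2445), BARCELONA),   # closest anchor selected
]
print(best_anchor_fraction(samples, anchors))
```

For country-scale anchor sets, a spatial index would replace the linear scan, but the decision rule stays the same.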
granularity, as the GT-DB population always has an anchor point within 10 Km in the vast majority of cases, but ii) end users' micro-mobility largely spoils the achieved granularity. We claim that if GeoIP providers were able to account for this micro-mobility in their mapping, the impact on the final applications such as online advertising would be huge, as we discuss in the following section.

6 THE IMPACT OF GEOIP ON ONLINE ADVERTISING
This section aims to estimate the impact of using GeoIP locations on online advertising campaigns. While the extensiveness of the GT-DB dataset used in §4 allowed us to understand the performance of GeoIP overall, that dataset may include all kinds of users, not only the ones who are actively targeted by ad providers. To this end, we use our bid stream dataset to generate the ground-truth data to guarantee that all the location samples are actually linked to users targeted by online advertising campaigns.
To measure the performance of a campaign, we rely on the accuracy (A) measure described in §4. However, to understand the best buying strategy from an economic point of view, in addition to the accuracy, we also have to consider the monetary cost C associated with different types of bid-requests, i.e., including GeoIP or GPS information. In order to isolate the monetary impact that the type of location data has on advertising campaigns, we need to factor out other elements affecting the economic performance of a campaign. To this end, we make the following assumptions: i) the bid stream has been filtered so that the available bid-requests already meet the goals of the campaign in terms of the targeted audience; ii) there is sufficient ad inventory of each type of location data (GeoIP vs. GPS) to meet the defined objective of the campaign in terms of the number of ad impressions delivered, so that the advertiser/DSP can freely choose to buy any combination of GeoIP and GPS bid-requests to meet such objective.

6.1 Methodology
6.1.1 Best bidding strategy
In this section, we model the best strategy that could be followed by an advertiser to issue a specific targeting ad campaign, based on the characteristics of the location technology and their associated cost. The goal of the advertiser is to maximize the value for money for every ad campaign.
Let us introduce this in a toy example, where the GPS accuracy is 100% by definition and the GeoIP accuracy of the bid-requests in this campaign is 20% (i.e., the location of the targeted user matches the location defined by the ad campaign once every five times). In this case, if the average cost of a GPS bid-request is twice the cost of a GeoIP bid-request, it would be more economically effective to buy GPS bid-requests. However, if the cost of a GPS bid-request was 6 times the cost of a GeoIP bid-request, it would be more economically effective to buy the latter.
To model this behaviour, we introduce the normalized cost related to each technology (C*IP and C*GPS), computed as:

C^*_{IP} = \frac{C_{IP}}{\min(C_{IP}, C_{GPS})} \qquad C^*_{GPS} = \frac{C_{GPS}}{\min(C_{IP}, C_{GPS})} \qquad (3)

Then, by using the accuracy obtained with the GeoIP and GPS technologies (AIP and AGPS) we can calculate the Effective Cost (φ), which is defined as the normalized cost of correctly delivering an ad to a user located in the targeted area, and is calculated as follows:

\phi_{IP} = \frac{C^*_{IP}}{A_{IP}} \qquad \phi_{GPS} = \frac{C^*_{GPS}}{A_{GPS}} \qquad (4)

Hence, the best expenditure strategy is defined by min(φIP, φGPS). For a given location-targeted campaign, by estimating the accuracy and cost for the two technologies, a DSP can steer its strategy according to this rule. Later in this section, we empirically evaluate A as well as φ for different real-world ad campaign scenarios.
Note that our methodology does not consider the potential economic side benefits/harms of showing ads to users outside the targeted area. For instance, a potential benefit might be expanding the knowledge of a new brand to neighboring areas of the specific location target. Conversely, a potential harm might be bothering users with ads uninteresting to them, which in addition introduces a waste of resources (e.g., bandwidth [33] and battery).

6.1.2 Bid stream ground-truth dataset
In order to precisely measure AIP, we have to select a set of bid-requests for which we know the end users' ground-truth location. We do this by keeping exclusively the bid-requests that include a GPS location (see §2.2), hence creating a reliable association between the users' IP addresses and their position posGT (we assume that AGPS = 100%). Then, we retrieve the location information from the geolocation databases using the IP address, obtaining posIP-A and posIP-B. Our ground-truth dataset includes the following information: <timestamp; IP address; posGT; posIP-A; posIP-B>.
Moreover, in order to retrieve the value of CGPS and CIP we rely on the bid floor information available in our bid stream dataset. Note that for estimating φGPS and φIP the relevant information is not the absolute price value for GPS and GeoIP ad impressions but the relative relation between them (C*GPS and C*IP). Hence, our assumption here is that the ratio of GeoIP and GPS price value is well captured by the ratio of their corresponding bid floor prices.

6.1.3 Simulation set-up
Our goal is to create a simulation set-up that mimics real location-targeted ad campaigns. For this purpose, we follow the guidance from industry players, such as TAPTAP Digital, to set up realistic values for our simulation parameters as described next:
Campaign duration: We set up a campaign duration between 1 and 2 weeks, which is a very common time frame used by advertisers for their ad campaigns.
Win rate: This parameter defines the fraction of won bid-requests out of all the bids run by a DSP in an ad campaign. We configure a win rate range between 20 and 40% in our reference ad campaigns.
Ad impression cost: We use the bid floor as a proxy metric to estimate the cost of ad impressions. We have computed the C*IP and C*GPS defined above as the median value of
IEEE TRANSACTIONS ON MOBILE COMPUTING 11
bid floor prices for GeoIP and GPS bid-requests collected for Spain, France, and Great Britain across 16 days. $C^*_{IP}$ is 1 for the three countries, whereas $C^*_{GPS}$ is 1.01, 2.34, and 2.08 for Spain, France, and Great Britain, respectively.

Geographical target: We consider campaigns targeting all administrative levels introduced in Table 1 except the country level. As discussed in §4, GeoIP services have perfect accuracy in providing the location at country level, so country-level campaigns are expected to have $A \approx 100\%$. Note that the 4 levels used in our simulations (state, province, city, and zip code) are frequently used as target locations in online advertising campaigns.

Urbanization level: A major portion of location-targeted advertising campaigns focus on urban areas, so our simulations focus on this type of area. Note that the urbanization level is only meaningful for administrative Levels 1 (zip code) and 2 (city), since we cannot select a province or a state that is entirely urban or rural.

For each of the considered countries (Spain, France, and Great Britain), we configure 4 campaign models based on the geographical target and urbanization level: a) Level 4, b) Level 3, c) Level 2-Urban, and d) Level 1-Urban. Overall, we have a total of 12 simulation scenarios. For each simulation scenario, we randomly select 5 different targets that meet its criteria, with the exception of Level 4 in Great Britain, for which we select only 3 targets since this level generally yields very high accuracy (see Fig. 2).

Overall, we have 58 different target locations in our simulation set (11 scenarios with 5 targets each, plus 3). Finally, for each of the 58 campaigns, we run 3 repetitions in which the campaign duration and the win rate are randomly selected from the ranges defined above for these parameters.

6.1.4 Campaign execution

We execute the simulated campaigns on the bid stream coming from our ground-truth dataset. We keep only the bid-requests whose $pos_{IP-A}$ or $pos_{IP-B}$ location matches the geographical target of the ad campaign in the selected time period. From the obtained subset, we consider only a random fraction of the bids, according to the win rate defined for the campaign. The final set of bid-requests resulting from this process represents the set of ad impressions actually delivered by the ad campaign.

6.1.5 Evaluation metrics

First, we compute the Accuracy ($A_{IP}$) metric to assess the impact of GeoIP location data in online advertising. We measure the accuracy on the delivered ad impressions as the fraction of them whose associated $pos_{GT}$ falls within the specific geographical target of the ad campaign. For each of the 58 simulated campaigns, we compute the average $A_{IP}$ across the three repetitions. Note that, as indicated above, $A_{GPS} = 100\%$.

Second, we compute $\phi_{IP}$ vs. $\phi_{GPS}$ using the expressions defined in Eq. 4 to identify the technology (GeoIP or GPS) yielding the most economically efficient campaign.

Third, using the values of $\phi_{IP}$ and $\phi_{GPS}$, we define the Gain ($G_{IP}$) of an ad campaign as follows:

$$G_{IP} = \log\left(\frac{\phi_{GPS}}{\phi_{IP}}\right) \tag{5}$$

$G_{IP}$ quantitatively compares the increase (decrease) in accuracy for GeoIP with the increase (decrease) in its cost. Thus, positive (negative) values of $G_{IP}$ provide a quantitative reference of the expected order of magnitude of the improvement (harm) of setting a strategy to buy GeoIP instead of GPS bid-requests.

Finally, note that we compute $A_{IP}$, $A_{GPS}$, $\phi_{IP}$, $\phi_{GPS}$ and $G_{IP}$ for both: i) the Actual mapping of the IP addresses' location to the anchor points implemented in GeoIP-DB-A, and ii) the Optimal assignment of IP addresses to the closest anchor point, as discussed in §5.

6.2 Results

We note that the results presented in this section correspond to GeoIP-DB-A. For the sake of simplicity, we do not report the results associated with GeoIP-DB-B, which lead to the same conclusions.

6.2.1 Accuracy

Fig. 11 shows the accuracy from the ad campaign simulations for the four geographical targets introduced in §6.1.3 in Spain, France and Great Britain when we consider the Actual (left side) or the Optimal (right side) mapping of IP addresses' location to anchor points.

The results of the Actual allocation strategy follow the expected pattern for $A$: the larger the geographical target, the higher the accuracy. Using Spain to illustrate this observation, $A$ grows from 5.25% for campaigns targeting zip codes in urban areas to 58.45% when the campaign resolution is at the state level.

In addition, it is interesting to notice that the accuracy varies considerably across countries in all the geographical targets, except for Level 3. In contrast, the accuracy reported with the GT-DB dataset (see Fig. 2 in §4) is more evenly spread across countries. This suggests that the users targeted by online advertising can be a rather skewed subset of the overall population that can be reached by the high-precision location providers discussed in §4.

When analyzing the Optimal assignment of IP addresses to the closest anchor point, we find that it largely outperforms the Actual allocation irrespective of the geographical target we consider, as expected. The worst case in the Optimal allocation ($A = 73.18\%$), corresponding to the zip code level in urban areas in Spain, is only 20 percentage points below the best case in the Actual allocation ($A = 93.16\%$), which comes from the state level in GB.

In conclusion, the average $A$ for the Optimal allocation strategy yields advantages for all geographical resolutions. If GeoIP services were able to approximate this Optimal performance, advertisers using location-targeted campaigns would experience a significant improvement in their campaigns' KPIs without requiring any further investment.

6.2.2 Optimal budget strategy

Tab. 3 shows the best buying strategy (i.e., buying GeoIP vs. GPS bid-requests) to be applied in each of the campaigns run for the four geographical targets introduced in §6.1.3 for the three countries, as a result of comparing $\phi_{IP}$ and $\phi_{GPS}$ in the simulated campaigns. Results are grouped by target location. For each target location, the table shows the
[Figure residue: two panels; series ES, FR, GB; x-axis: Level 1 Urban, Level 2 Urban, Level 3, Level 4; caption not recoverable]

TABLE 3: Best technology (GPS vs. GeoIP) to set the buying strategy based on the analysis of φ.
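The comparison behind Table 3 can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `BidRequest` record, the region labels, and all function names are hypothetical; Eq. 4 is not reproduced in this section, so the sketch assumes that φ is the normalized cost divided by the accuracy (cost per correctly located impression), a choice under which the sign convention of Eq. 5 agrees with the text; and the logarithm is assumed to be base 10.

```python
import math
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class BidRequest:
    """One ground-truth record; regions are hypothetical labels (e.g. a zip code)."""
    region_gt: str  # region that contains pos_GT (GPS ground truth)
    region_ip: str  # region that contains the GeoIP position (e.g. pos_IP-A)


def run_campaign(bids, target, win_rate, rng):
    """Section 6.1.4: keep bids whose GeoIP region matches the target,
    then win a random fraction of them given by the win rate."""
    eligible = [b for b in bids if b.region_ip == target]
    n_won = round(win_rate * len(eligible))
    return rng.sample(eligible, n_won)


def accuracy(delivered, target):
    """A_IP: fraction of delivered impressions whose ground truth is in the target."""
    if not delivered:
        raise ValueError("no delivered impressions")
    return sum(b.region_gt == target for b in delivered) / len(delivered)


def normalized_costs(c_ip, c_gps):
    """Eq. 3: divide each cost by the smaller of the two."""
    floor = min(c_ip, c_gps)
    return c_ip / floor, c_gps / floor


def phi(norm_cost, acc):
    """ASSUMPTION (Eq. 4 is not shown here): cost per correctly located impression."""
    if acc <= 0:
        raise ValueError("accuracy must be positive")
    return norm_cost / acc


def gain_ip(a_ip, c_ip, c_gps, a_gps=1.0):
    """Eq. 5 with an assumed base-10 logarithm; positive values favour GeoIP."""
    c_ip_n, c_gps_n = normalized_costs(c_ip, c_gps)
    return math.log10(phi(c_gps_n, a_gps) / phi(c_ip_n, a_ip))


if __name__ == "__main__":
    # Synthetic bid stream: 100 bids geolocated by IP to zip "28001",
    # of which only 20 are truly there, plus 50 unrelated bids.
    bids = (
        [BidRequest("28001", "28001")] * 20
        + [BidRequest("28002", "28001")] * 80
        + [BidRequest("08001", "08001")] * 50
    )
    delivered = run_campaign(bids, "28001", win_rate=0.3, rng=random.Random(0))
    print("A_IP on delivered impressions:", accuracy(delivered, "28001"))

    # Toy example from Section 6.1.1 (GeoIP accuracy 20%, GPS accuracy 100%).
    print("GPS twice as expensive:", gain_ip(0.20, 1.0, 2.0))  # negative: buy GPS
    print("GPS six times as expensive:", gain_ip(0.20, 1.0, 6.0))  # positive: buy GeoIP

    # Spain, Actual mapping, with the reported C*_GPS = 1.01.
    print("ES zip code (urban):", gain_ip(0.0525, 1.0, 1.01))
    print("ES state:", gain_ip(0.5845, 1.0, 1.01))
```

Under the assumed φ, GeoIP is the better buy only when $A_{IP}$ exceeds $C^*_{IP}/C^*_{GPS}$ (with $A_{GPS} = 100\%$), which is why a GPS price premium as small as the 1.01 reported for Spain leaves little room for GeoIP.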
our finding that the error of GeoIP-based locations of IP addresses (or prefixes) using cellular access connections is significantly larger than that of those using fixed connections.

The closest literature to our study is formed by studies that analyze the accuracy of GeoIP databases. In one of the earliest studies on the topic, Poese et al. [13] use data from an ISP to analyze the performance and accuracy of 5 different GeoIP databases. In particular, they find that none of these databases provides a good mapping of the actual IP prefixes used by the ISP. Furthermore, they also map the location of each IP prefix to the location of the Point-of-Presence (PoP) where the associated backbone router is located. Unfortunately, this location ground truth might be significantly less accurate than the GPS coordinates from a mobile device that we use in this paper. In an almost concurrent study, Shavitt et al. [37] compare the performance of 6 GeoIP databases. They use two types of ground-truth datasets: the geographical location of PoPs and a ground-truth database including the location of 25k IP addresses up to the city level. The paper uses precision as the studied performance metric. The authors also analyze the correlation between the errors of different GeoIP databases.

There are very few papers in the literature using ground-truth data based on GPS location information. Triukose et al. [38] leverage the GPS location provided by a mobile app, and assess the error of GeoIP location services using the IP address of the device. Complementary to our study, this paper shows evidence that NATed IP addresses offer worse location accuracy than public IP addresses. However, this study presents an important limitation, since the dataset only includes information about devices connected through cellular (3G/GPRS) technology. In a similar study, Komosny et al. [39] use 700 mobile devices from which they recover the GPS location to construct a ground-truth dataset to evaluate the performance of 8 different GeoIP databases. While these studies rely on GPS ground-truth data, their datasets are formed by tens of thousands of location samples, compared to more than 2B samples in our dataset.

Finally, there are some previous works complementary to ours, which analyze the performance of GeoIP databases in geolocating network infrastructure elements. Instead, we are interested in analyzing the performance in the geolocation of end users. Gharaibeh et al. [40] use a ground-truth dataset including the city-level location of 16.5K router interface IP addresses, whereas Iordanou et al. [41] focus their analysis on the location of servers. Both works conclude that GeoIP databases are highly inefficient in geolocating network infrastructure elements.

Our study presents three major contributions in comparison with the previous literature: 1) We present the first benchmark analysis of the upper-bound performance that GeoIP could offer (see §5); 2) To the best of the authors' knowledge, all existing works analyze the performance of GeoIP databases in an isolated manner and just briefly mention which businesses might be affected by the reported inaccuracy of GeoIP. Instead, we present, for the first time, a detailed quantitative analysis of the potential impact of the extensive use of GeoIP in online advertising, which arguably represents the most important business where GeoIP is applied; 3) We present the most thorough study of GeoIP performance in terms of scale and resolution. In particular, our study leverages over 2B ground-truth samples with GPS precision. This allows us to present the most comprehensive study of GeoIP performance, studying it up to zip code resolution and covering the impact of several relevant factors such as the level of urbanization, the access technology or the specific ISP.

As a final remark, to the best of the authors' knowledge, there is only one company, Location Sciences [42], offering location data auditing products in the online advertising ecosystem. Unfortunately, as with all other auditing solutions in online advertising [43], [44], [45], their products are proprietary, and it is unknown how they operate or what their actual performance is.

8 CONCLUSION

To the best of the authors' knowledge, our study is: 1) the one that provides the deepest understanding of the performance of GeoIP databases; 2) the first one providing an upper bound of the performance these systems may offer; and 3) the first one analyzing their impact on the online advertising business. These three elements constitute (in our humble opinion) an important contribution to researchers and practitioners and make our paper novel compared to any other previous study in the context of GeoIP databases.

In this paper, we present an analysis of two GeoIP databases, which are arguably among the most widespread technologies used to locate devices around the entire world, especially in the context of online advertising. To the best of our knowledge, our study is: i) the one that provides the deepest understanding of the performance of GeoIP databases; ii) the first one providing an upper bound of the performance these systems may offer; and iii) the first one analyzing their impact on the online advertising business.

Armed with a dataset of 2B samples that includes a ground-truth location associated with an IP address, we study the performance of GeoIP databases along several dimensions unexplored so far: urban vs. rural areas, access technologies, and ISPs. Our work revisits the quantitative findings of previous studies regarding the performance issues of this technology and extends them to better understand their causes.

Thanks to the extensiveness of our data, we can dig further into the performance of GeoIP databases, showing possible causes behind the lack of accuracy and discussing how, under ideal conditions, the overall precision could be improved by two orders of magnitude.

Finally, we show that, from a budgetary perspective, GeoIP may be, in some cases, a better technology for geographically targeted ad campaigns than more precise geolocation technologies (i.e., GPS) due to the expected higher cost of the latter. The most efficient technology in economic terms is the one that better balances accuracy and cost. This is initially a counter-intuitive result, since most of the literature in the area focuses on reporting the poor location capacity of GeoIP databases.

ACKNOWLEDGEMENTS

This research received funding from the European Union's Horizon 2020 innovation action programme under the PIMCITY project (Grant 871370) and the TESTABLE project
(Grant 101019206); the Agencia Estatal de Investigación (AEI) under the ACHILLES project (Grant PID2019-104207RB-I00/AEI/10.13039/501100011033); the Spanish Ministry of Economic Affairs and Digital Transformation and the European Union-NextGenerationEU through the UNICO 5G I+D 6G-RIEMANN-FR; the agreement between the Community of Madrid and the Universidad Carlos III de Madrid for the funding of research projects on SARS-CoV-2 and COVID-19 disease, project name "Multi-source and multi-method prediction to support COVID-19 policy decision making", which was supported with REACT-EU funds from the European regional development fund "a way of making Europe"; and the TAPTAP-UC3M Chair in advanced AI and Data Science applied to advertising and marketing.

REFERENCES

[1] A. Daviel, F. Kaegi, and M. Kofahl, "Geographic extensions for HTTP transactions," Working Draft, IETF Secretariat, Internet-Draft draft-daviel-http-geo-header-05, December 2007. [Online]. Available: http://www.ietf.org/internet-drafts/draft-daviel-http-geo-header-05.txt
[2] W3C. (2018) Geolocation API Specification 2nd Edition. [Online]. Available: https://www.w3.org/TR/geolocation-API/
[3] M. Fiore, P. Katsikouli, E. Zavou, M. Cunche, F. Fessant, D. Le Hello, U. Aivodji, B. Olivier, T. Quertier, and R. Stanica, "Privacy in trajectory micro-data publishing: a survey," Transactions on Data Privacy, vol. 13, pp. 91–149, 2020.
[4] S. Rodriguez Garzon and B. Deva, "Geofencing 2.0: Taking location-based notifications to the next level," in Proceedings of the 2014 ACM International Joint Conference on Pervasive and Ubiquitous Computing, ser. UbiComp '14. New York, NY, USA: Association for Computing Machinery, 2014, pp. 921–932. [Online]. Available: https://doi.org/10.1145/2632048.2636093
[5] IAB, "Mobile Programmatic Playbook," https://www.iab.com/wp-content/uploads/2015/05/MobileProgrammaticPlaybook.pdf, 2015.
[6] IAB, "Speaking the same language in location-based marketing," https://www.iab.com/blog/location-based-marketing-glossary/, 2019.
[7] R. Gonzalez, C. Soriente, and N. Laoutaris, "User profiling in the time of HTTPS," in Proceedings of the 2016 Internet Measurement Conference, ser. IMC '16. New York, NY, USA: Association for Computing Machinery, 2016, pp. 373–379. [Online]. Available: https://doi.org/10.1145/2987443.2987451
[8] T. Theodoridis, S. Papadopoulos, and Y. Kompatsiaris, "Assessing the reliability of Facebook user profiling," in Proceedings of the 24th International Conference on World Wide Web, ser. WWW '15 Companion. New York, NY, USA: Association for Computing Machinery, 2015, pp. 129–130. [Online]. Available: https://doi.org/10.1145/2740908.2742728
[9] J.-W. van Dam and M. van de Velden, "Online profiling and clustering of Facebook users," Decision Support Systems, vol. 70, pp. 60–72, 2015. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0167923614002796
[10] J. Estrada-Jiménez, J. Parra-Arnau, A. Rodríguez-Hoyos, and J. Forné, "Online advertising: Analysis of privacy threats and protection approaches," Computer Communications, vol. 100, pp. 32–51, 2017. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0140366416307083
[11] J. M. Carrascosa, J. Mikians, R. Cuevas, V. Erramilli, and N. Laoutaris, "I always feel like somebody's watching me: Measuring online behavioural advertising," in Proceedings of the 11th ACM Conference on Emerging Networking Experiments and Technologies, ser. CoNEXT '15. New York, NY, USA: Association for Computing Machinery, 2015. [Online]. Available: https://doi.org/10.1145/2716281.2836098
[12] V. N. Padmanabhan and L. Subramanian, "An investigation of geographic mapping techniques for internet hosts," in Proceedings of the 2001 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications, ser. SIGCOMM '01. New York, NY, USA: Association for Computing Machinery, 2001, pp. 173–185. [Online]. Available: https://doi.org/10.1145/383059.383073
[13] I. Poese, S. Uhlig, M. A. Kaafar, B. Donnet, and B. Gueye, "IP geolocation databases: Unreliable?" ACM SIGCOMM Computer Communication Review, vol. 41, no. 2, pp. 53–56, 2011.
[14] Maxmind. (2021) GeoIP2 Databases. [Online]. Available: https://www.maxmind.com/en/geoip2-databases
[15] Digital Element. (2021) Netacuity Industry-Leading IP Geolocation Data. [Online]. Available: https://www.digitalelement.com/solutions/
[16] IP2Location, "Identify geographical location and proxy by IP address," https://www.ip2location.com/, 2021.
[17] Neustar, "IP intelligence," https://www.home.neustar/security-intelligence/ip-geopoint, 2021.
[18] M. Gouel, K. Vermeulen, O. Fourmaux, T. Friedman, and R. Beverly, "IP geolocation database stability and implications for network research," pp. 19–33, 2021.
[19] IAB Tech Lab. (2021) OpenRTB (Real-Time Bidding). [Online]. Available: https://iabtechlab.com/standards/openrtb
[20] Taptap Digital, "Omnichannel advertising and marketing intelligence powered by location," https://www.taptapdigital.com/, 2021.
[21] Safegraph. (2021) The Source of Truth for Places Data. [Online]. Available: https://www.safegraph.com/
[22] Cuebiq. (2021) Mobility data that fuels growth. [Online]. Available: https://www.cuebiq.com/
[23] Foursquare. (2021) Foursquare location data platform. [Online]. Available: https://foursquare.com/
[24] Tamoco. (2021) The World's Smartest Location and Geospatial Company. [Online]. Available: https://www.tamoco.com/
[25] TapTap. (2021) Sonata, Global Platform for Mobile-Centric Audience Engagement. [Online]. Available: https://www.sonataplatform.com/
[26] G. V. Brummelen, Heavenly Mathematics: The Forgotten Art of Spherical Trigonometry. Princeton University Press, 2013.
[27] United Kingdom. (2021) Open Data Portal. [Online]. Available: https://data.gov.uk/
[28] France. (2021) Open Data Portal. [Online]. Available: https://www.data.gouv.fr/
[29] Uber. (2021) H3: Hexagonal hierarchical geospatial indexing system. [Online]. Available: https://h3geo.org/
[30] Eurostat, "Degree of urbanisation (degurba)," https://ec.europa.eu/eurostat/web/degree-of-urbanisation/background, 2018.
[31] Maxmind. (2021) GeoIP2 City Accuracy. [Online]. Available: https://www.maxmind.com/en/geoip2-city-accuracy-comparison
[32] F. Aurenhammer, Voronoi diagrams and Delaunay triangulations. Hackensack, New Jersey: World Scientific, 2013.
[33] B. Pourghassemi, J. Bonecutter, Z. Li, and A. Chandramowlishwaran, "AdPerf: Characterizing the Performance of Third-Party Ads," Proc. ACM Meas. Anal. Comput. Syst., vol. 5, no. 1, Feb. 2021. [Online]. Available: https://doi.org/10.1145/3447381
[34] B. Gueye, S. Uhlig, and S. Fdida, "Investigating the imprecision of IP block-based geolocation," in Passive and Active Network Measurement, S. Uhlig, K. Papagiannaki, and O. Bonaventure, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 2007, pp. 237–240.
[35] R. Padmanabhan, J. P. Rula, P. Richter, S. D. Strowes, and A. Dainotti, "Dynamips: Analyzing address assignment practices in IPv4 and IPv6," in Proceedings of the 16th International Conference on Emerging Networking EXperiments and Technologies, ser. CoNEXT '20. New York, NY, USA: Association for Computing Machinery, 2020, pp. 55–70. [Online]. Available: https://doi.org/10.1145/3386367.3431314
[36] M. Balakrishnan, I. Mohomed, and V. Ramasubramanian, "Where's that phone? Geolocating IP addresses on 3G networks," in Proceedings of the 9th ACM SIGCOMM Conference on Internet Measurement, ser. IMC '09. New York, NY, USA: Association for Computing Machinery, 2009, pp. 294–300. [Online]. Available: https://doi.org/10.1145/1644893.1644928
[37] Y. Shavitt and N. Zilberman, "A geolocation databases study," IEEE Journal on Selected Areas in Communications, vol. 29, no. 10, pp. 2044–2056, 2011.
[38] S. Triukose, S. Ardon, A. Mahanti, and A. Seth, "Geolocating IP addresses in cellular data networks," in International Conference on Passive and Active Network Measurement. Springer, 2012, pp. 158–167.
[39] D. Komosny, M. Vozňák, and S. Rehman, "Location accuracy of commercial IP address geolocation databases," Information Technology And Control, vol. 46, Sep. 2017.
[40] M. Gharaibeh, A. Shah, B. Huffaker, H. Zhang, R. Ensafi, and C. Papadopoulos, "A look at router geolocation in public and commercial databases," in Proceedings of the 2017 Internet Measurement Conference, ser. IMC '17. New York, NY, USA: Association for Computing Machinery, 2017, pp. 463–469. [Online]. Available: https://doi.org/10.1145/3131365.3131380
[41] C. Iordanou, G. Smaragdakis, I. Poese, and N. Laoutaris, "Tracing cross border web tracking," in Proceedings of the Internet Measurement Conference 2018, ser. IMC '18. New York, NY, USA: Association for Computing Machinery, 2018, pp. 329–342. [Online]. Available: https://doi.org/10.1145/3278532.3278561
[42] Location Sciences, "Location sciences," https://www.locationsciences.ai/, 2021.
[43] Double Verify, "Double verify," https://doubleverify.com/, 2021.
[44] Human, "Bot Mitigation - Know Who's Real," https://www.humansecurity.com/, 2021.
[45] Integral Ad Science, "Integral ad science," https://integralads.com/, 2021.

Ángel Cuevas received the M.Sc. (2007) and the Ph.D. (2011) degrees in Telematics Engineering from the University Carlos III of Madrid. He is currently an Associate Professor in the Department of Telematic Engineering, University Carlos III of Madrid. He is a co-author of more than 70 papers in prestigious international journals and conferences, such as the IEEE/ACM TRANSACTIONS ON NETWORKING, the ACM Transactions on Sensor Networks, Computer Networks (Elsevier), the IEEE NETWORK, the IEEE Communications Magazine, USENIX Security, WWW, ACM CoNEXT, and ACM CHI. His research interests focus on Internet measurements, web transparency, privacy, and P2P networks. He was a recipient of the Best Paper Award at ACM MSWiM 2010.