Studying Solutions of The P-Median Problem For The Location of Public Bike Stations
1 Introduction
Driving a vehicle through the city is an increasingly difficult and annoying task.
A large number of traffic jams at different times of the day, the rising cost of fuel,
and the rising level of pollution in cities are some of the biggest problems for
citizens. For all these reasons, more and more citizens are looking for alternative
and sustainable ways to move around the city. Bicycles are a good way to get
around: they are clean and environmentally friendly, beneficial to health, and
help to avoid getting trapped in traffic jams. As a result, an increasing number
of people use bicycles as their main means of transport in the city. Municipalities
have become aware of this trend and try to promote their use by providing the
© Springer Nature Switzerland AG 2018
F. Herrera et al. (Eds.): CAEPIA 2018, LNAI 11160, pp. 198–208, 2018.
https://doi.org/10.1007/978-3-030-00374-6_19
p-Median Solutions for Locating Public Bike Stations 199
necessary infrastructure, such as areas to place them and bike lanes. In addition,
numerous initiatives have been taken by local authorities and private companies
to promote the so-called bike-sharing. To manage the distribution of bicycles,
stations are usually set up where bicycles can be picked up/dropped off. But
finding the best location for those facilities is not a trivial job.
In this work, we tackle the public bike stations location problem (PBSLP). To
do so, we formulate the problem as a classical location problem, namely
the p-median problem. The p-median problem tries to identify, given a set of
locations and customers and distances between locations and customers, a subset
of locations of size p such that the total distance of travel of customers to the
closest location is minimized. In other words, the goal is to identify locations so
that the average distance of customers to bike stations is minimized. While we
are not the first to consider this problem as a possible model for the PBSLP
[17], we study here different variants of the problem, including different distance
metrics. This classic formulation is accompanied by real data from the city of
Malaga, Spain, which allows us to test our proposal and variants of the p-median
problem as close as possible to the reality of the city. Finally, an advantage of
this modeling is that the p-median problem is well-studied and many efficient
and effective algorithms are available for it. We further enhance the performance
of some popular p-median algorithms by an automatic algorithm design process.
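As a compact restatement of the definition above, the weighted p-median objective can be written as follows. The notation is chosen for this sketch only and is an assumption, not necessarily the notation of Sect. 2: C is the set of customers, F the set of candidate locations, w_i the weight of customer i, and d_{ij} the distance between customer i and location j.

```latex
\min_{S \subseteq F,\; |S| = p} \;\; \sum_{i \in C} w_i \, \min_{j \in S} d_{ij}
```

Setting w_i = 1 for every customer gives the unweighted variant, and dividing the sum by \sum_{i \in C} w_i turns the total distance into the average distance per customer.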
The rest of this article is organized as follows: Sect. 2 presents the formulation
of the problem. Section 3 describes the selected optimization algorithm. Section 4
analyzes the main results of our work using as a real scenario the city of Malaga.
We discuss related work in Sect. 5 and conclude in Sect. 6.
the p-median may serve as a good proxy to identify interesting locations and
relevant possible distributions of the bicycle stations across an urban area.
While there are other formulations of the problem [2,10,12], some of them
require various types of additional information. In contrast, the p-median problem
requires rather little information (only the distance matrix and possibly the
weights). Although this may seem like a limitation, it is relatively simple to add
further information, either by pre-processing the weights or distances (e.g.,
by considering the slopes of the streets) or by adding terms to the formulation
itself (e.g., capacity information related to the number of bikes at each
bicycle parking site). Hence, the present work defines a baseline for future
work that will take these advanced characteristics into account.
3 Algorithm
The algorithm that we use in this article is based on the variable neighborhood
search (VNS) algorithm for the p-median problem presented by Mladenović and
Hansen [15]. It was selected as the base for our development because VNS
algorithms and their variants have been used in location problems and, in
particular, for the p-median problem, where they reach very good performance [1,16,18].
While there are a number of VNS variants for the p-median problem, we have
developed a component-wise implementation of such algorithms. In particular,
we use algorithmic components that have been used for the original, basic VNS
algorithm [15] (OVNS), and two more recent variants proposed in [7] (BVNS and
DVNS). In order to find an improved version of the VNS, we use an automatic
algorithm configuration stage through the use of the iterated racing procedure
implemented by the irace package [13]. We used a budget (number of configurations)
of 5000 and, as training instances, we chose instances of the TSPLIB
library (footnote 1). The locations of the cities correspond to the locations of the customers
and the possible facilities. Since the practical problem we tackle involves small
values of p, we used p ∈ {10, 20, 30} for each instance
of the training set. For the configuration process we considered the
following components together with their associated numerical parameters:
Neighborhood Model. We used two models to select the k neighbors: the nearest
points (NEAR), and the division of the space around the point into four
quadrants, selecting (if possible) an equal number of neighbors in each
quadrant (QUAD).
Local Search. We apply a local search procedure to the initial solution
(localsearch1) and in each iteration of the algorithm (localsearch2). The possible
local search algorithms are: Fast Interchange [20], IALT [8] (Laux parameter),
and IMP [7].
Shake. We use two different solution modifications: we exchange n facilities
either with randomly chosen ones in the neighborhood (NEIGHBOR) or
with ones selected randomly from the whole set of facilities (RAND).
1
TSPLIB instances: http://comopt.ifi.uni-heidelberg.de/software/TSPLIB95/tsp/.
Initial Solution. We have four procedures to generate the initial solution: random
(RAND), START [7], the best of 100 random solutions (100RAND), and
the best result of applying IMP to 100 different random solutions (100IMP).
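To make the interplay of the components above concrete, the following is a minimal sketch of a basic VNS loop for the p-median problem. It is not the configured algorithm of this paper: it assumes a random initial solution (RAND), a RAND-style shake, and a plain first-improvement interchange in place of Fast Interchange [20]. The function names and the parameters k_max and iterations are hypothetical choices for illustration.

```python
import random

def cost(dist, weights, stations):
    """Weighted sum of distances from each customer to its closest open station."""
    return sum(w * min(row[j] for j in stations) for row, w in zip(dist, weights))

def local_search(dist, weights, stations, n_sites):
    """Plain first-improvement interchange (a simplification, not Fast Interchange)."""
    current = set(stations)
    best = cost(dist, weights, current)
    improved = True
    while improved:
        improved = False
        for out in sorted(current):
            for inn in range(n_sites):
                if inn in current:
                    continue
                candidate = (current - {out}) | {inn}
                c = cost(dist, weights, candidate)
                if c < best:
                    current, best, improved = candidate, c, True
                    break
            if improved:
                break  # restart the scan from the improved solution
    return current, best

def shake(stations, n_sites, k, rng):
    """RAND-style shake: replace k open sites with k randomly chosen closed sites."""
    closed = [j for j in range(n_sites) if j not in stations]
    k = min(k, len(stations), len(closed))
    removed = rng.sample(sorted(stations), k)
    added = rng.sample(closed, k)
    return (set(stations) - set(removed)) | set(added)

def basic_vns(dist, weights, p, k_max=3, iterations=50, seed=0):
    """dist[i][j] is the distance from customer i to candidate site j."""
    rng = random.Random(seed)
    n_sites = len(dist[0])
    start = rng.sample(range(n_sites), p)  # RAND initial solution
    best, best_cost = local_search(dist, weights, start, n_sites)
    for _ in range(iterations):
        k = 1
        while k <= k_max:
            shaken = shake(best, n_sites, k, rng)
            candidate, c = local_search(dist, weights, shaken, n_sites)
            if c < best_cost:
                best, best_cost, k = candidate, c, 1  # accept and reset neighborhood
            else:
                k += 1  # try a larger perturbation
    return sorted(best), best_cost
```

The other options listed above (NEAR/QUAD neighborhoods, START, 100RAND, 100IMP, IALT, IMP) would replace the corresponding function in this skeleton, which is what makes a component-wise implementation convenient for automatic configuration.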
Table 1 presents the parameter information used by irace. The last column
shows the best values found by irace. These values form the parameters of the
VNS algorithm used in the experiments. We leave a detailed comparison
of the configured VNS algorithm with previously proposed VNS algorithms to an
extended version of the paper; in what follows we focus on the usage of the
configured algorithm to study the location of public bike stations in Malaga.
Table 1. Parameters for the automatic configuration process. Given are the parameter,
its type, its domain of possible values, the condition under which the parameter is
relevant (conditional parameters), and the value the parameter takes in the automatically
obtained configuration.
4 Experimental Study
As mentioned, our study focuses on the usage of the p-median problem to suggest
possible solutions that may be implemented in the real world to tackle the
PBSLP. For this reason, the use of actual data is essential in this work.
We have chosen the city of Malaga as our case study because it offers a wide
variety of open data (see municipality open data website:
http://datosabiertos.malaga.eu/) and already has a functioning
bike-sharing system with 23 publicly accessible stations, which allows us to
compare the solutions obtained from the p-median model with the actual situation.
The p-median problem needs two sets of points: customers and facilities. The
customers are the citizens. As their positions, we have chosen
the centers of the different neighborhoods (363) of the city (we have excluded six
neighborhoods on the outskirts of Malaga) and the weights are the real number
202 C. Cintrano et al.
Fig. 1. Public bike stations in the city of Malaga (blue points) and the center of each
neighborhood (red points). (Color figure online)
We next present the main results of our case study and compare them
with the real scenario. We used real data and the tuned VNS to optimize each
instance. Each run of the algorithm was given 60 CPU seconds and reports the best
solution found in that run. The implementation was done in C++ and compiled
with the flag -O3. The computing platform is a desktop computer with one
Intel i5-4460 processor running at 3.2 GHz and 8 GB of memory. We carried
out 100 runs of the algorithm to compare the results statistically.
to this last one as real distance. We have also taken into account the population
density in each neighborhood by weighting the problem using the number
of citizens, and we compare the result with how solutions would look if these weights were
not taken into account. This gives us four scenarios to study, corresponding to
weighted or unweighted customer locations with Euclidean or real distances. We
used the same number of stations that are currently in use in Malaga, that is,
p = 23. Figure 2 shows the empirical cumulative distribution of the percentage
of improvement in the objective function value of each scenario when compared
to the result of calculating the objective function for the actual location of the
23 bike stations in Malaga (evaluated according to the corresponding scenario).
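The real-distance scenarios above need a customer-to-site distance matrix; the conclusions state that real distances were calculated by Dijkstra's algorithm. The following is a minimal sketch of that step, assuming the street network is available as an adjacency dictionary. The graph representation and the function names are hypothetical and not taken from the original implementation.

```python
import heapq

def dijkstra(graph, source):
    """Shortest-path lengths from source.

    graph maps a node to a list of (neighbor, edge_length) pairs.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry, a shorter path was already found
        for v, length in graph.get(u, []):
            nd = d + length
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def distance_matrix(graph, customers, sites):
    """Rows are customers, columns are candidate sites; unreachable pairs get infinity."""
    matrix = []
    for c in customers:
        d = dijkstra(graph, c)
        matrix.append([d.get(s, float("inf")) for s in sites])
    return matrix
```

The Euclidean scenarios would use the same matrix layout, filled with straight-line distances instead of shortest-path lengths, so the optimization code does not need to know which metric is in use.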
[Fig. 2 plot: empirical cumulative distribution Fn(x) (y-axis, 0.00 to 1.00) of the improvement over Malaga in % (x-axis, 45 to 70) for each scenario.]
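The quantities plotted in Fig. 2 can be computed along the following lines. This is a sketch under the assumption that one objective value per run is available together with the objective value of the current station layout; the function names are hypothetical.

```python
def improvement_percent(baseline, value):
    """Relative reduction of the objective with respect to the baseline, in percent."""
    return 100.0 * (baseline - value) / baseline

def ecdf(samples):
    """Return the sorted samples and the cumulative fraction i/n at the i-th smallest one."""
    xs = sorted(samples)
    n = len(xs)
    return xs, [(i + 1) / n for i in range(n)]
```

For one scenario, the list passed to ecdf would hold improvement_percent(baseline, run_value) for each of the 100 runs, where baseline is the objective of the 23 existing stations evaluated under that same scenario.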
Table 2. Distance traveled per inhabitant to the nearest station in the different
scenarios, evaluated as real weighted distance. The minimum values are marked in bold.
calculated the average distance that the citizens must travel to their nearest
station. Table 2 shows this distance information for each scenario. In general,
our solutions reduce the average distance per person by 500 m with respect to
the city council’s solution, even in the worst case. As expected, optimizing with
real distances and weights gives the best results when the solutions are
evaluated with this same distance. It is interesting to note that using weights
in the Euclidean distance case reduces the average distance walked more than
changing the distance computation from Euclidean to real distances. This tells
us that using demographic
information is more important than the type of distance used, if one wants to
improve the quality of service to the user.
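The per-inhabitant distance discussed above is the weighted objective divided by the total population. A minimal sketch, assuming a real-distance matrix with one row per neighborhood and one column per candidate site (the function name and the numbers in the usage note are illustrative only, not data from this study):

```python
def average_distance(dist, population, stations):
    """Population-weighted mean distance from each neighborhood to its nearest station."""
    total = sum(population)
    travelled = sum(
        w * min(row[j] for j in stations) for row, w in zip(dist, population)
    )
    return travelled / total
```

With two neighborhoods of 3000 and 1000 inhabitants and dist = [[100, 900], [800, 200]], opening only site 0 gives 275 m per inhabitant, while opening both sites gives 125 m. Evaluating every scenario's solution with this same function and the real weighted distances is what makes the scenarios comparable.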
5 Related Work
In bicycle sharing systems there are multiple problems to address such as pre-
dicting the filling of stations [19], the location-allocation of bicycles [3,5], routes
Table 3. Walking distance for different numbers of additional stations; the number of
additional stations is p − 23, where p is the number given in the first column.
for users or the transfer of bicycles [9], etc. There are also complete solutions
that take into account multiple aspects of bike sharing systems [11]. However,
they are very complex solutions that require large amounts of information that
are not always available.
If we focus on the optimal location of the stations, we find solutions such
as those proposed in [2]. In that paper, the authors use real data and machine
learning techniques to find the best places to position the stations. A similar
approach is used in [12], where the authors study New York City, which has
a large network of bike stations. In [17] a comparison between two models for the
station location problem is presented: the p-median and
the MCLP. However, the authors do not give details about the algorithm that is used.
In [4] the authors try to find the best locations so as to ensure that bicycles
can be collected and deposited, taking into account the demand (at peak times)
and the possible routes between stations made by users. These works are
based on the demand for the bike-sharing system. However, they do not take into
account potential users who could use the system if it were closer to them.
Our work uses demographic data to bring the system closer to all users (which
has not been taken into account in other work). In addition, existing solutions
use custom formulations, making it difficult to compare them with other
state-of-the-art solutions. We formulate the problem of locating bike stations as a
p-median problem, which gives us a solid base for our studies and lets us
benefit from all the advances made on the p-median problem.
6 Conclusions
In this work we have modeled the public bike stations location problem as a
classic problem of location, the p-median problem. To solve it we used a VNS
algorithm that was automatically configured with irace. We have studied differ-
ent scenarios: Euclidean and real distances (calculated by Dijkstra’s algorithm),
and whether or not to consider population densities. As might be expected, the
most realistic scenario (real distances and weights) reported the best results, and
all configurations performed much better than the Malaga public bike-sharing
system. Finally, we have studied where stations should ideally be added if the
current public bike system is extended. Even with a few additional
stations, we have obtained substantial reductions in the average distance
of customers to bike stations; overall the improvements have been in the range
of 43–77% with respect to the current situation in Malaga.
As future work we want to test our model in bigger cities such as Madrid or New
York; try other heuristic algorithms and integer linear programming solvers, such
as CPLEX; and study variants of the p-median problem, such as minimizing the distance
to customers while maximizing the distance between facilities, or the capacitated
version. It would also be interesting to include more realistic data, such as bike trips,
cycle lanes, traffic patterns, points of interest in the city (schools, hospitals,
museums, etc.), or stops of other public transport, to promote multimodal transport.
References
1. Avella, P., Boccia, M., Salerno, S., Vasilyev, I.: An aggregation heuristic for large
scale p-median problem. Comput. Oper. Res. 39(7), 1625–1632 (2012)
2. Chen, L., et al.: Bike sharing station placement leveraging heterogeneous urban
open data. In: Proceedings of the 2015 ACM International Joint Conference on
Pervasive and Ubiquitous Computing - UbiComp 2015, pp. 571–575. ACM Press,
NY (2015)
3. Chen, Q., Liu, M., Liu, X.: Bike fleet allocation models for repositioning in bike-
sharing systems. IEEE Intell. Transp. Syst. Mag. 10(1), 19–29 (2018)
4. Chen, Q., Sun, T.: A model for the layout of bike stations in public bike-sharing
systems. J. Adv. Transp. 49(8), 884–900 (2015)
5. Chira, C., Sedano, J., Villar, J.R., Cámara, M., Corchado, E.: Urban bicycles
renting systems: modelling and optimization using nature-inspired search methods.
Neurocomputing 135, 98–106 (2014)
6. Dantrakul, S., Likasiri, C., Pongvuthithum, R.: Applied p-median and p-center
algorithms for facility location problems. Expert Syst. Appl. 41(8), 3596–3604
(2014)
7. Drezner, Z., Brimberg, J., Mladenović, N., Salhi, S.: New heuristic algorithms for
solving the planar p-median problem. Comput. Oper. Res. 62, 296–304 (2015)
8. Drezner, Z., Brimberg, J., Mladenović, N., Salhi, S.: New local searches for solving
the multi-source Weber problem. Ann. Oper. Res. 246(1–2), 181–203 (2016)
9. Hu, S.R., Liu, C.T.: An optimal location model for a bicycle sharing program
with truck dispatching consideration. In: 17th International IEEE Conference on
Intelligent Transportation Systems (ITSC), pp. 1775–1780. IEEE, October 2014
10. Kloimüllner, C., Raidl, G.R.: Hierarchical clustering and multilevel refinement for
the bike-sharing station planning problem. In: Battiti, R., Kvasov, D.E., Sergeyev,
Y.D. (eds.) LION 2017. LNCS, vol. 10556, pp. 150–165. Springer, Cham (2017).
https://doi.org/10.1007/978-3-319-69404-7 11
11. Lin, J.R., Yang, T.H., Chang, Y.C.: A hub location inventory model for bicycle
sharing system design: formulation and solution. Comput. Ind. Eng. 65(1), 77–86
(2013)
12. Liu, J., et al.: Station site optimization in bike sharing systems. In: 2015 IEEE
International Conference on Data Mining, pp. 883–888. IEEE, November 2015
13. López-Ibáñez, M., Dubois-Lacoste, J., Cáceres, L.P., Birattari, M., Stützle, T.:
The irace package: iterated racing for automatic algorithm configuration. Oper.
Res. Perspect. 3, 43–58 (2016)
14. Megiddo, N., Supowit, K.J.: On the complexity of some common geometric
location problems. SIAM J. Comput. 13(1), 182–196 (1984)
15. Mladenović, N., Hansen, P.: Variable neighborhood search. Comput. Oper. Res.
24(11), 1097–1100 (1997)
16. Mladenović, N., Brimberg, J., Hansen, P., Moreno-Pérez, J.A.: The p-median prob-
lem: a survey of metaheuristic approaches. Eur. J. Oper. Res. 179(3), 927–939
(2007)
17. Park, C., Sohn, S.Y.: An optimization approach for the placement of bicycle-
sharing stations to reduce short car trips: an application to the city of Seoul.
Transp. Res. Part A: Policy Pract. 105, 154–166 (2017)
18. Reese, J.: Methods for Solving the p-Median Problem: An Annotated Bibliography
(2006)
19. Singhvi, D., et al.: Predicting Bike Usage for New York City’s Bike Sharing System
(2015)
20. Whitaker, R.A.: A fast algorithm for the greedy interchange for large-scale
clustering and median location problems. INFOR: Inf. Syst. Oper. Res. 21(2), 95–108
(1983)