
Harvesting Efficient On-Demand Order Pooling from Skilled Couriers: Enhancing Graph Representation Learning for Refining Real-time Many-to-One Assignments
Yile Liang, Meituan, Beijing, China
Jiuxia Zhao, Meituan, Beijing, China
Donghui Li, Meituan, Beijing, China
Jie Feng∗, Tsinghua University, Beijing, China
Chen Zhang†, Tsinghua University, Beijing, China
Xuetao Ding, Meituan, Beijing, China
Jinghua Hao, Meituan, Beijing, China
Renqing He, Meituan, Beijing, China

arXiv:2406.14635v1 [cs.AI] 20 Jun 2024

∗ Corresponding author.
† This work was fulfilled when Chen Zhang interned at Meituan.

Conference acronym ’XX, June 03–05, 2018, Woodstock, NY

ABSTRACT
The recent past has witnessed a notable surge in on-demand food delivery (OFD) services, offering delivery fulfillment within dozens of minutes after an order is placed. In OFD, pooling multiple orders for simultaneous delivery in real-time order assignment is a pivotal efficiency source, which may in turn extend delivery time. Constructing high-quality order pooling to harmonize platform efficiency with the experiences of consumers and couriers is crucial to OFD platforms. However, the complexity and real-time nature of order assignment make extensive calculations impractical and significantly limit the potential for order consolidation. Moreover, the offline environment is frequently riddled with unknown factors, posing challenges for the platform’s perceptibility and pooling decisions.
Nevertheless, delivery behaviors of skilled couriers (SCs), who know the environment well, can improve system awareness and effectively inform decisions. Hence an SC delivery network (SCDN) is constructed, based on an enhanced attributed heterogeneous network embedding approach tailored for OFD. It aims to extract features from rich temporal and spatial information, and uncover the latent potential for order combinations embedded within SC trajectories. Accordingly, the vast search space of order assignment can be effectively pruned through scalable similarity calculations of low-dimensional vectors, making comprehensive and high-quality pooling outcomes more easily identified in real time. In addition, the acquired embedding outcomes highlight promising subspaces embedded within this space, i.e., scale-effect hotspot areas, which can offer significant potential for elevating courier efficiency.
SCDN has now been deployed in the Meituan dispatch system. Online tests reveal that with SCDN, the pooling quality and extent have been greatly improved, and our system can boost couriers’ efficiency by 45-55% during noon peak hours, while upholding the timely delivery commitment.

KEYWORDS
on-demand food delivery, order pooling, many-to-one assignment problem, graph representation learning

1 INTRODUCTION

1.1 Backgrounds
In recent years, there has been a remarkable upsurge in the widespread adoption of on-demand food delivery (OFD) services worldwide. With a mere few clicks, consumers can enjoy delicious meals without stepping out, all delivered right to their doorstep within just a few dozen minutes. This trend is attributable to the overarching shifts in technological innovation, including the popularity of apps and online platforms, and the growing dependence on third-party services for OFD. Global revenues for the OFD sector were about $90 billion in 2018, rose to $294 billion in 2021, and are expected to exceed $466 billion by 2026 [16]. Meituan Waimai, China’s pioneering OFD platform, has witnessed remarkable growth over the last decade. In 2023, the platform handled over 70 million orders daily, reaching almost 3,000 cities, counties and regions throughout China. 6.24 million couriers earned income via Meituan, with over 1 million actively engaged daily.
In OFD, orders are placed continuously by consumers from various locations. In response, the platform promptly gathers these newly initiated orders, channels them to merchants, and assigns dedicated couriers for pick-up and delivery within the promised delivery time. The platforms act as intermediaries, linking a multitude of consumers, merchants and couriers within the ecosystem, and strike a balance between gains and losses among these stakeholders to achieve sustained growth and prosperity [14]. Among these, consumers desire prompt services, merchants seek to maintain food freshness, couriers aim to fulfill enough orders to earn a decent income in a safe environment, while OFD platforms focus on boosting efficiency to reduce costs and increase profits.

Figure 1: A courier’s concurrent execution and route sequence of four orders.

In this context, couriers often engage in concurrent execution of multiple delivery tasks, including order pick-up and delivery. A pivotal efficiency source in OFD is to pool multiple orders for simultaneous delivery by a single courier in order assignment, leveraging shared pick-up and delivery behaviours and travelling distances, and enabling couriers to serve more orders within committed delivery time limits. Facilitating comprehensive order pooling can effectively reduce delivery costs and enhance OFD sustainability [23, 24]. Figure 2(a) presents a high-quality order pooling example, where the courier’s pickup points are highly concentrated, and the delivery destinations are aligned along a coherent route, enabling the courier to fulfill the deliveries with remarkable efficiency. However, unreasonable order pooling may result in detours and prolonged delivery times, severely undermining the stakeholders’ experiences. Figure 2(b) illustrates a scenario in which unreasonable order pooling negatively impacts a courier’s route, leading to an inefficient delivery trajectory.

(a) High-quality order pooling.
(b) Unreasonable order pooling.
Figure 2: Order pooling examples.

In Meituan Waimai, the dispatch system conducts city-level batch order assignments every 30 seconds [14]. In each dispatch cycle, the system identifies available couriers for new orders, and assesses the matching degree (MD) between them, including convenience of route, over-time risk, and courier acceptance willingness. This evaluation process demands massive computations for pick-up and delivery route planning (PDRP) to simulate couriers’ behaviors after accepting orders [5]. Subsequently, through the resolution of a multi-objective many(order)-to-one(courier) assignment (MOA) problem, the system matches orders with the most suitable couriers to optimize the overall MD scores.
Constructing comprehensive and high-quality order pooling in order assignments stands as a key issue for OFD platforms to harmonize platform efficiency with stakeholder experience. Practically, there are two primary methods to facilitate comprehensive and high-quality order pooling in order assignments during each dispatch cycle. The first approach entails identifying suitable order combinations among all the pending orders, such as those with shared pick-up/delivery tasks or minimal detours, aiming to increase the ratio of MOA outcomes. The second approach focuses on matching orders with couriers whose existing assignments can share pick-up/delivery tasks or travel routes with the new orders, thereby optimizing the delivery process.

1.2 Challenges
However, OFD’s distinct features present considerable challenges.
(1) Computational complexity in real time. On one hand, the MD scores based on PDRP outcomes are non-additive. Specifically, the MD score of assigning multiple orders concurrently to a courier is not equivalent to the sum of the scores of assigning each order individually to the same courier. Hence, modeling the MOA problem and obtaining sufficient order combination results usually demands massive MD score calculations, which suffer from combinatorial explosion, as depicted in Figure 3. The MOA problem details can be found in Appendix A. For some big cities in China during the noon peak, over 3 thousand orders (counting the order volume in several geographically adjacent areas within a city, not the total order volume for the entire city) are to be assigned in each dispatch cycle, while each order can retrieve hundreds of couriers available for delivery on average. Assuming at most 5 orders assigned to a courier, and that the average number of courier candidates for an order (combination) is 100, the calculation volume is
$\left(C_{3000}^{1} + C_{3000}^{2} + C_{3000}^{3} + C_{3000}^{4} + C_{3000}^{5}\right) \times 100$. On the other hand, the MOA problem itself is categorized as an NP-hard integer programming problem, known for its extremely vast search space. Crafting online algorithms that perform effectively for the MOA is an exceptionally challenging task [3, 15, 35]. Moreover, the fast movement of couriers requires assignment decisions to be made within a mere 10 seconds. This imperative time frame ensures the consistency of courier status between the information acquisition phase and the actual assignment moment.
Consequently, the platform tends to favor one(order)-to-one(courier) assignments during each dispatch cycle, a strategy that reduces computational volume and complexity, albeit at the expense of comprehensive order pooling.

Figure 3: Calculation volume and search space for modeling and solving MOA problems in each dispatch cycle.

(2) Limited system awareness on the "last mile" offline environment. In OFD, the "last mile" offline environment is highly intricate and dynamic [34], encompassing unforeseen road closures, unknown natural obstacles, and pandemic-related lockdowns. OFD platforms are unable to fully access these extensive, finely-detailed spatiotemporal data during large-scale decision-making, due to insufficient map precision and digital capabilities, along with computational and storage constraints. Consequently, order pooling decisions based on coarse data and limited awareness may not be reasonable, potentially harming courier experiences, causing delivery delays, and reducing delivery efficiency.

1.3 Related Work
Prior research on order pooling algorithms primarily focused on batching issues in traditional warehouse management [1, 19, 30]. However, the more relaxed time constraints of warehouse batching algorithms, typically in minutes or even hours, are not well suited to the urgency required in OFD.
In recent years, research pertaining to OFD has gradually gained traction. The prevalent method for order pooling batches orders based on geographical proximity and closeness of their promised delivery time [22]. However, while these criteria-based batching rules are straightforward, they limit the scope for consolidation. An exact algorithm for order batching and assignment is proposed in [31], under the unrealistic assumption of perfect information about the arrival of orders. The study in [9] produces monthly OFD task groupings offline to facilitate order consolidation. However, their effectiveness is heavily reliant on order structure stability. Work in [10, 11] achieves order consolidation using iterative clustering on an order graph, but the batching algorithm’s complexity and computational load hinder real-time processing. Similar work in [24] leverages additional decomposition mechanisms to reduce computational cost, yet it falls short of enabling real-time application despite notable performance gains. To satisfy the need for solutions within seconds, XGBoost models are built through supervised learning on historical order assignment results in [29, 32], to promote combined order assignments. However, the consolidation results struggle to break through the constraints of historical decisions, resulting in limited effectiveness.

1.4 Motivations
In light of the limitations present in existing work, it’s worth noting that OFD platforms are equipped with a vast fleet of couriers, and extensive data on courier behaviors, especially from the skilled ones, which offer insights for high-efficiency and quality delivery services and enhance system intelligence. Skilled couriers (SCs) often possess a comprehensive grasp of the offline environment, including order distribution and road logistics, and continually improve their delivery skills to adapt to complex conditions. Moreover, our couriers can reject or transfer system-assigned orders, leveraging their expertise to optimize routes, minimizing detours and overtime. Additionally, the platform gathers courier preferences for pick-up and delivery locations via their apps, promoting efficient operations with fewer bottlenecks. Thus SCs’ behaviours of order selection, route sequence and feedback can provide the system superior courier-oriented pooling outcomes and help improve decision quality.
In the past decade, the work on word representation learning has achieved cutting-edge results [7, 17, 20, 25]. Neural language models replace traditional high-dimensional and sparse word vectors with low-dimensional and dense embeddings, which assume that frequently co-occurring words share stronger statistical dependencies. Recently, graph representation learning (GRL) methods [4, 13] have increasingly been applied in various fields, including e-commerce [6, 8, 28], job search [12, 21], and ride-sharing [26, 27], to discover diverse types of recommendations on the Web. These approaches have had a major impact in both academia and industry.
Drawing on prior achievements and the principle that orders frequently combined together in SCs’ routes tend to yield top-tier pooling results, this paper aims to use GRL methods to uncover the latent potential for order pooling embedded within the SCs’ behaviour data. Therefore, through scalable low-dimension vector calculations, instead of massive and time-consuming PDRP computations, we effectively prune the MOA problem’s search space, shown in Figure 3, and meanwhile extract small-scale and isolated subspaces promising for high-quality order consolidation results, facilitating real-time, effective order pooling.

1.5 Contributions
Accordingly, a systemic solution framework, named SC delivery network (SCDN), is proposed. The novel contributions are:
(1) Graph Modelling: We construct a delivery network from SC route sequences, with flow units (FUs) as nodes linked by SC behavior sequences. An FU is a directed vector from a pick-up area of interest (AOI) [36] to a delivery AOI, where AOIs are defined as non-overlapping irregular polygons that comprehensively divide and cover the space. Orders of an FU share the
same pick-up and delivery AOIs. The network is formulated as an attributed multiplex heterogeneous network (AMHEN), with FU nodes featuring multiple attributes for temporal and spatial information, and links representing two different types of courier behaviors, namely pick-up and delivery.
(2) Learning Algorithm: Based on GATNE [2], an effective GRL method for AMHEN, an enhanced attributed heterogeneous network embedding (EATNE) approach tailored for OFD is derived to obtain FU embeddings. First, given the fact that couriers move within a confined region in a city (a circular area with a diameter of 3-5 km, centered on the courier’s designated residence), a region-congregated negative sampling mechanism is proposed as an enhancement over traditional randomized negative sampling to improve algorithm performance. Second, we employ a customized margin ranking loss instead of the cross-entropy used by GATNE, aiming to refine embedding quality. Last, to address dispersed order distribution and limited FU coverage in SC behaviors, we build a cold start mitigation mechanism, using geographic information to generate embeddings of FUs previously unseen, thus broadening coverage.
(3) MOA Search Space Refinement and OFD Application: Utilizing FU embedding, we reconstruct the order combination and courier recall mechanisms within Meituan’s dispatch system, facilitating superior real-time order pooling. Our use of SCDN refines order structure profiles and pinpoints scale-effect hotspots within MOA’s vast search space, uncovering independent and small-scale subspaces for thorough and high-quality order pooling. Accordingly, an innovative delivery mode is developed to enhance courier efficiency without compromising service reliability.
To our knowledge, this is the first application of GRL methods in achieving real-time order pooling in OFD, now deployed in Meituan Waimai’s dispatch system. Online tests show significant improvement in order pooling. The total MD score of the MOA problem is improved by 5.3%, indicating more efficient order assignments with reduced detours and overtime risks. The newly-built mode cut the average incremental pick-up time for couriers (defined as the interval between picking up the current order and the preceding one in the courier’s route) during noon peak by 51% and delivery time by 21%. These enhancements have led to a 45-55% boost in efficiency, maintaining consistent work hours and on-time delivery standards.

2 GRAPH REPRESENTATION LEARNING APPROACH
In this section, we will detail the step-by-step process by which the FU embeddings are acquired.

2.1 AMHEN Construction
The AMHEN is constructed based on SC route sequences as described below. The definition of SC and selection criteria of SC route sequences are introduced in Appendix B.
We first divide an SC’s route sequence into distinct sessions, using the rest or no-action interval as a separator, presently set to 30 minutes. Then we transform the route sessions into FU sequences by replacing the orders in the sessions with their FUs. Since couriers participate in both pick-up and delivery actions during order fulfilment, there are two kinds of FU sequences: one based on pick-up behavior and the other on delivery, as shown in Figure 4. Diverse couriers’ FU sequences may incorporate some common FUs.

Figure 4: Illustration of AMHEN Construction, including 2 sessions. Session A contains 3 orders for FUs DE, FB and FC. The pick-up FU sequence is DE->FC->FB. And the delivery FU sequence is DE->FB->FC. Session B follows the same process.

To capture shared experiences of SCs, by treating FUs as nodes and their connections in the FU sequence as links, we can integrate all the FU sequences into a unified yet heterogeneous graph. Moreover, it is crucial to utilize the rich temporal and spatial information to enhance learning accuracy, e.g. average historical order amount and delivery distance of each FU, which makes the above graph an AMHEN. More about node attributes is in Appendix C.
Denote the AMHEN by $G = (V, E, A)$, where $V$ is the FU node set and $A$ is the attribute set for all nodes. FU node $v_i \in V$ owns rich attributes $\mathbf{x}_i \in A$ to describe its crucial characteristics. $E = (E_p, E_d)$ is the set of edges, which contains two types: pick-up and delivery. Specifically, there may be two types of edges between the FU nodes $v_i$ and $v_j$, where $e_{ij}^{p} \in E_p$ indicates a pick-up edge and $e_{ij}^{d} \in E_d$ a delivery one. If two orders, belonging to FU nodes $v_i$ and $v_j$, are successively picked up by the same SC, there exists a pick-up edge $e_{ij}^{p}$ connecting $v_i$ and $v_j$. Similarly, a delivery edge $e_{ij}^{d}$ indicates there exist orders of FU nodes $v_i$ and $v_j$ that are consecutively delivered by the same SC. Hence, an AMHEN is constructed by merging massive records from tens of thousands of SCs.

2.2 Graph Representation Learning Model
Treating the AMHEN as input, we apply the model in GATNE [2] to produce node vector representations, i.e. FU embeddings, which can be regarded as the aggregation of various node attributes and topology information in the graph, as depicted in Figure 5.
We divide the whole embedding of node $v_i$ on each edge type $\tau$ into two parts, base embedding and edge embedding. The base embedding $\mathbf{b}_i$ is defined as a parameterized function of its attributes $\mathbf{x}_i$ as $\mathbf{b}_i = \mathbf{h}(\mathbf{x}_i)$, where $\mathbf{h}$ is a transformation function, while the $k$-th level edge embedding $\mathbf{u}_{i,\tau}^{(k)} \in \mathbb{R}^{s}$ $(1 \le k \le K)$ of node $v_i$ on edge type $\tau$ is aggregated from the edge embeddings of neighbors:
\[
\mathbf{u}_{i,\tau}^{(k)} = \operatorname{aggregator}\left(\left\{ \mathbf{u}_{j,\tau}^{(k-1)}, \forall v_j \in \mathcal{N}_{i,\tau} \right\}\right) \tag{1}
\]
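A minimal sketch, assuming undirected links between consecutive FUs and toy level-0 embeddings, of how the two edge types of Section 2.1 and one level of the aggregation in Equation (1) could be computed for Session A of Figure 4. The function names `build_edges` and `mean_aggregate`, the embedding values, and the rule that an isolated node keeps its previous embedding are illustrative assumptions rather than the authors' implementation; the mean aggregator follows the paper's remark that the aggregator is a mean operation in practice.

```python
from collections import defaultdict


def build_edges(sequences):
    """Link consecutive FUs in each FU sequence.

    Assumption: links are treated as undirected neighbor sets.
    """
    neighbors = defaultdict(set)
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            if a != b:
                neighbors[a].add(b)
                neighbors[b].add(a)
    return neighbors


def mean_aggregate(prev, neighbors):
    """One level of Equation (1) with a mean aggregator.

    prev maps each FU to its level k-1 edge embedding (list of floats).
    Assumption: a node without neighbors keeps its previous embedding.
    """
    nxt = {}
    for node, vec in prev.items():
        nbrs = sorted(neighbors.get(node, ()))
        if not nbrs:
            nxt[node] = list(vec)
            continue
        nxt[node] = [
            sum(prev[n][d] for n in nbrs) / len(nbrs) for d in range(len(vec))
        ]
    return nxt


# Session A from Figure 4: three orders for FUs DE, FB and FC.
pickup_sequences = [["DE", "FC", "FB"]]
delivery_sequences = [["DE", "FB", "FC"]]

graph = {
    "p": build_edges(pickup_sequences),    # pick-up edges E_p
    "d": build_edges(delivery_sequences),  # delivery edges E_d
}

# Toy level-0 edge embeddings with s = 2; the values are made up.
u0 = {"DE": [1.0, 0.0], "FB": [0.0, 1.0], "FC": [1.0, 1.0]}

# Level-1 edge embeddings, computed separately per edge type.
u1 = {tau: mean_aggregate(u0, graph[tau]) for tau in ("p", "d")}
```

With these inputs, FC has pick-up neighbors DE and FB, so its level-1 pick-up embedding is their mean, [0.5, 0.5]. Repeating the step K times per edge type gives the K-th level edge embedding used in the rest of the model.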
In Equation (1), $\tau \in \{p, d\}$ indicates the edge type, $s$ is the dimension of edge embeddings, and $\mathcal{N}_{i,\tau}$ is the set of neighbors of node $v_i$ on edge type $\tau$. The initial edge embedding $\mathbf{u}_{i,\tau}^{(0)}$ is parameterized as a function of the attributes $\mathbf{x}_i$: $\mathbf{u}_{i,\tau}^{(0)} = \mathbf{g}_\tau(\mathbf{x}_i)$, where $\mathbf{g}_\tau$ is a transformation function. The aggregator function is the mean operation in practice.
We denote the $K$-th level edge embedding $\mathbf{u}_{i,\tau}^{(K)}$ by $\mathbf{u}_{i,\tau}$. Then the pick-up edge embedding $\mathbf{u}_{i,p}$ and the delivery edge embedding $\mathbf{u}_{i,d}$ of node $v_i$ are combined as $\mathbf{U}_i = \left(\mathbf{u}_{i,p}, \mathbf{u}_{i,d}\right)$. Given that the pick-up edge and delivery edge have different impacts, a self-attention mechanism is used to calculate the weights $\mathbf{a}_{i,\tau} \in \left\{\mathbf{a}_{i,p}, \mathbf{a}_{i,d}\right\}$:
\[
\mathbf{a}_{i,\tau} = \operatorname{softmax}\left(\mathbf{w}_\tau^{\top} \tanh\left(\mathbf{W}_\tau \mathbf{U}_i\right)\right)^{\top} \tag{2}
\]
where $\mathbf{w}_\tau \in \mathbb{R}^{d_a}$ and $\mathbf{W}_\tau \in \mathbb{R}^{d_a \times s}$ are trainable parameters for edge type $\tau$. Thus, the overall embeddings of node $v_i$ for the pick-up edge, $\mathbf{v}_{i,p}$, and the delivery edge, $\mathbf{v}_{i,d}$, can be computed as:
\[
\mathbf{v}_{i,p} = \mathbf{h}(\mathbf{x}_i) + \alpha_p \mathbf{a}_{i,p} \mathbf{M}_p^{\top} \mathbf{u}_{i,p} + \beta_p \mathbf{g}_p(\mathbf{x}_i) \tag{3}
\]
\[
\mathbf{v}_{i,d} = \mathbf{h}(\mathbf{x}_i) + \alpha_d \mathbf{a}_{i,d} \mathbf{M}_d^{\top} \mathbf{u}_{i,d} + \beta_d \mathbf{g}_d(\mathbf{x}_i) \tag{4}
\]
where $\alpha_p$ and $\alpha_d$ indicate the importance of pick-up and delivery edge embeddings, respectively, characterizing how pick-up and delivery behaviors affect courier efficiency. $\mathbf{M}_p, \mathbf{M}_d \in \mathbb{R}^{s \times d}$ are trainable parameters. $\beta_p$ and $\beta_d$ control the importance of node attributes. The FU embedding $\mathbf{v}_i$ is the average of $\mathbf{v}_{i,p}$ and $\mathbf{v}_{i,d}$. The detailed implementation of EATNE can be found in Appendix D.

Figure 5: Illustration of the GRL Model.

2.3 Model Optimization
The positive data for training is generated by a meta-path-based random walk method and skip-gram model [17]. Given a set of pick-up FU sequences $S$, supposing that a random walk with length $l$ on $S$ follows a path $S_p = \left(v_{s_1}, \ldots, v_{s_l}\right)$, the pick-up context of $v_{s_t}$ is denoted as $C\left(v_{s_t}^{S_p}\right) = \left\{ v_{s_k} \mid v_{s_k} \in S_p, |k - t| \le c, t \ne k \right\}$, where $c$ is the size of the sampling window. Thus, given a node $v_i$ and all its pick-up contexts, we can generate a positive pick-up data set $\mathcal{D}_p^P$ of positive pairs $(v_i, v_j)$, which indicates that SCs frequently pool the orders of these FUs together. Similarly, we can generate a positive data set $\mathcal{D}_d^P$ from the delivery FU sessions.
Negative Sampling. Since couriers usually move within a confined region, negative samples from different regions are too easy for the model to distinguish throughout the training stage, which makes the learning inefficient. Therefore, the negative data sets $\mathcal{D}_p^N$, $\mathcal{D}_d^N$ are constructed by random sampling from pick-up and delivery FU pairs in the same delivery region but excluding positive pairs, respectively. In other words, we select $k$-hop ($k > 2$) neighbors of the FU node that share the same confined region as the challenging negative samples to enable the effective training of the proposed model. Traditional GATNE uses randomized negative sampling, yet ignores the regional effects in OFD. We find that the performance of GATNE decreases as the negative sampling scope expands and the effect becomes almost random as it reaches the city size.
Margin Ranking Loss. The learning task is to make the representations of positive FU pairs lie nearby in the embedding space, and those of negative pairs differ. However, achieving this with cross-entropy can be challenging. Therefore, a customized optimization objective based on margin ranking loss is proposed to maximize the distance between positive and negative samples in Equation 5, where $\gamma_p^P$, $\gamma_d^P$, $\gamma_p^N$ and $\gamma_d^N$ are hyperparameters representing the weights of various data sets, $m_p$ and $m_d$ are the minimum distance between negative pairs for pick-up and delivery, and $\cos$ represents the cosine similarity between FU embeddings.
\[
\begin{aligned}
L ={} & \frac{\gamma_p^P}{|\mathcal{D}_p^P|} \sum_{(v_i, v_j) \in \mathcal{D}_p^P} \bigl(1 - \cos(\mathbf{v}_{i,p}, \mathbf{v}_{j,p})\bigr) \\
& + \frac{\gamma_d^P}{|\mathcal{D}_d^P|} \sum_{(v_i, v_j) \in \mathcal{D}_d^P} \bigl(1 - \cos(\mathbf{v}_{i,d}, \mathbf{v}_{j,d})\bigr) \\
& + \frac{\gamma_p^N}{|\mathcal{D}_p^N|} \sum_{(v_i, v_j) \in \mathcal{D}_p^N} \max\bigl(0, \cos(\mathbf{v}_{i,p}, \mathbf{v}_{j,p}) - m_p\bigr) \\
& + \frac{\gamma_d^N}{|\mathcal{D}_d^N|} \sum_{(v_i, v_j) \in \mathcal{D}_d^N} \max\bigl(0, \cos(\mathbf{v}_{i,d}, \mathbf{v}_{j,d}) - m_d\bigr)
\end{aligned}
\tag{5}
\]

2.4 Embedding Coverage Improvement
SC behaviors cover only 60% of FUs. To compensate for the loss, we construct an extended delivery network based on geographical adjacency, shown in Figure 6. The criterion for judging spatial adjacency between FUs is that the pick-up AOIs should be the same (an emphasis that is due to existing data analysis and courier feedback) and the distance between delivery points is less than a threshold (currently 1 km). If no adjacent FUs are found, we relax the criterion to only consider the same pick-up AOI as a fallback. Then the embeddings of FUs previously unseen can be estimated by aggregating the embeddings of their existing neighboring FUs in the network constructed above. This increases FU embedding coverage to over 80%.

3 APPLICATION AND DEPLOYMENT

3.1 Model Deployment
As introduced above, FU embeddings are learned from SC behavior data using EATNE. Different models are created for diverse scenarios, like weekday/weekend and peak/idle time, due to their significant differences in order structure. Moreover, to accelerate training in big cities, we use community detection algorithms to
partition the city network into separate regional groups for parallel training at regional group level.
The models are trained using 4 weeks of data across the country. They are trained for less than 2 weeks on 4 NVIDIA Tesla V100 GPUs with 32GB of memory each, and the models get updated every 2 weeks.

Figure 6: Illustration of spatial adjacency relationship in the extended delivery network.

3.2 Information Mining
Leveraging the FU embeddings, we’ve created a set of indices.
(1) High-quality pooling probability (HPP) quantifies how well multiple orders can be consolidated together, sharing common pick-up and delivery times and travel distances. Since two FUs that consecutively appear in the SC behavior sequence often possess the above traits, this metric is calculated by the cosine similarity between the FU embeddings of these orders, reflecting the frequency of consecutive co-occurrence of the two FUs in SC behavior data.
\[
p_{ij} = \cos(\mathbf{v}_i, \mathbf{v}_j), \quad \forall i, j \in V \tag{6}
\]
Orders with high HPP values can be consolidated and assigned to the same courier to achieve efficient delivery.
(2) FU efficiency indicator (FEI) measures how much an order in this FU improves efficiency, based on how likely it is to be combined with orders from other FUs to form an efficient delivery sequence. It is calculated by the weighted aggregate of HPPs for the FU and its neighbouring FUs that share the same or nearby pick-up or delivery AOIs. The weights are determined by the order volume of those neighboring FUs.
\[
\eta_i = \sum_{j \in V_i} p_{ij} \times w_{ij}, \quad \forall i \in V \tag{7}
\]
The higher the FEI value, the more likely the order is to be efficiently pooled with other orders, thus improving courier efficiency. FEI values are normalized at the city level for ease of comparison.
(3) Scale-effect hotspot (SEH) for OFD refers to a local network of geographically proximate FUs, wherein the marginal cost and time of delivery for couriers fulfilling orders in this network progressively diminish, allowing for comprehensive order consolidation within promised delivery time. Accordingly, FUs in an SEH should have high FEI values, and any pair of FUs in the same SEH should exhibit a relatively high HPP. In addition, the total order volume for each SEH should exceed certain criteria.
\[
S = \left\{ i \in V \;\middle|\; \begin{array}{l} \eta_i > \mathit{Thre}_{\eta}; \\ p_{ij} > \mathit{Thre}_{p}, \ \forall i, j \in S \end{array} \right\} \tag{8}
\]

3.3 Deployment in Dispatch System
The above information, including FU embeddings, FEI, and SEH, is introduced in the system via offline features. As either low-dimensional vectors or scalars, they are performance-friendly to the real-time storage of the system. The main system framework is shown in Figure 7.

Figure 7: The main execution process of the dispatch system in each dispatch cycle.

3.3.1 Order Combination and Courier Recall. The MOA problem of our system is now solved by well-crafted constructive heuristics, i.e. the imitation learning-enhanced iterated matching algorithm (ILIMA) [3], since metaheuristic algorithms with in-depth search fail to meet the real-time requirements [35]. Meanwhile, a few orders are combined in mutually exclusive groups based on the closeness of their origins and destinations, as well as promised delivery time, before MD score evaluation. However, the real-time performance requirement severely restricts the search depth of the algorithm, resulting in insufficient and suboptimal order pooling.
With SCDN, we develop scalable mechanisms for courier recall and order combination, which can cut down the MOA search space, and let us focus our limited computation time on promising areas. Generally, orders with high HPP are formed as favorable combinations in advance, which can greatly expand the proportion of combined orders. Order combinations with low HPP and couriers whose on-hand orders mostly share low HPP with the new order are filtered out. Hence, we can facilitate high-quality order pooling in real time, without an obvious increase in score calculation volume and computation time.
Order Combination. Based on HPP, high-quality order combinations can be identified and incorporated into ILIMA as expanded decision entities rather than single orders. As illustrated in Figure 8, on one hand, order combinations with very low HPP can be pruned to avoid unnecessary score calculation. On the other hand, since top-tier order combinations found by high HPP should be pooled to the same courier, other combinations containing partial orders, and conflicting orders themselves, can be removed from the search space. It can guide ILIMA to search deeply and effectively without obviously increasing score calculation volume.
Courier Recall. When retrieving available couriers for an order (combination), we calculate the average value of HPP between it and the courier’s on-hand orders, to quickly estimate MD between the order (combination) and the courier, instead of time-consuming score calculations. For the on-hand orders already picked up by the courier, its FU can be considered as the FU starting from the AOI
Harvesting Efficient On-Demand Order Pooling from Skilled Couriers Conference acronym ’XX, June 03–05, 2018, Woodstock, NY

where the courier is currently located and ending at its delivery AOI. This further helps to prune the MOA search space and reduce real-time computational pressure while maintaining solution quality, as shown in Figure 9.

Figure 8: Order combination mechanism pruning MOA search space using HPP information. For example, for candidate orders A, B, C, and combinations AB, AC, BC, if AB and AC have higher HPP, then only AB, AC, B and C are preserved for MD evaluation, while A and BC can be eliminated.

Figure 9: Courier recall mechanisms pruning MOA search space using HPP information.

The implementation details of order combination and courier recall mechanisms can be found in Appendix F.

3.3.2 Highly Efficient Delivery Mode. SEHs identified by SCDN essentially represent small-scale subspaces deeply embedded within the MOA search space, where thorough and high-quality order pooling outcomes can be found, as shown in Figure 10. Then a new delivery mode can be built, wherein a dedicated group of couriers is assigned to each SEH, as opposed to receiving assignments in the entire region. Accordingly, the original large-scale MOA problem, initially solved within a vast search space shown in Figure 3, can be effectively decomposed into a collection of small-scale MOA problems, defined within much smaller and independent regional subspaces, paving the way for comprehensive and in-depth real-time searching. This approach serves to continually enhance the courier efficiency potential.

Figure 10: Promising MOA search subspace described by SEH.

In the delivery mode, order assignments for each SEH are conducted as follows:

(1) Hourly SEH Identifications. SEHs for certain time periods in a city are found using binary programming (BP), which categorizes FUs with high FEI within a specified time period into a number of mutually exclusive sets. It aims to maximize the average HPP among FUs within each set, with FU quantities and total historical order volume in each set as constraints. Practically, in some mega cities like Beijing, SEHs in peak periods are determined every 30 minutes to capture the changes in order structure. The BP problems for SEH identification can be solved via genetic algorithm [18] within 10 minutes. More information is in Appendix G.

(2) Real-time Parallel MOA Solutions. Order assignment for SEH is a scaled-down MOA problem. Given the limited area and stable order structure for SEH in a certain time period, the behavioral patterns of mode couriers are highly certain, thus simplifying the MD evaluation. In reality, we evaluate the MD via a weighted sum of average order increments for pick-up and delivery AOIs in a courier’s route after the new order acceptance for SEH, instead of time-intensive PDRP calculations to simulate couriers’ routes. Hence we can evaluate the MD between any promising order combinations and candidate couriers in real time, and solve the completely-modeled MOA problem for each SEH using a Hill-Climbing heuristic algorithm [33] in parallel, helping to pool orders effectively and thoroughly. Orders outside SEHs keep the existing assignment rules.

4 EXPERIMENTAL EVALUATION

4.1 Model Performance Evaluation

4.1.1 Model Learning Performance. A link prediction task is used to evaluate the performance of EATNE, with AUC, F1 score and PR as evaluation criteria. The experiments are conducted on a real-world dataset collected from Meituan delivery platform, using a single Linux server with NVIDIA Tesla V100 GPU with 32GB memory. The dataset contains 28,000 SC behavior records from 28 days in Beijing, China, forming a delivery network with about 70,000 FUs. For each edge type, the test set is generated with 10% randomly chosen positive edges and an equal number of negative edges, selected by negative sampling. Parameter details are in Appendix E.

First we examine the effectiveness of EATNE. Figure 11 shows that the original GATNE struggles to converge in this situation. In contrast, EATNE, armed with regional negative sampling and margin loss, produces superior outcomes and converges much faster. Next the performance of EATNE in various graph configurations is
Conference acronym ’XX, June 03–05, 2018, Woodstock, NY Yile Liang et al.

investigated. Table 1 shows that optimal performance is achieved by graphs with pick-up and delivery edges and node attributes, supporting the validity of the proposed AMHEN. Notably, pick-up connections are more important than delivery ones, indicating pick-up behaviours have a greater effect on courier efficiency. Moreover, adding node attributes is highly impactful, highlighting order structure’s key role in affecting courier efficiency.

Figure 11: The convergence curve for different algorithms.

Table 1: Model performance under different graph settings

Node Attr.   Pick-up Edge   Delivery Edge   AUC    F1     PR
✓            ✓              ✓               0.79   0.72   0.75
✗            ✓              ✓               0.64   0.60   0.59
✓            ✗              ✓               0.74   0.69   0.71
✓            ✓              ✗               0.76   0.71   0.73

4.1.2 FU Embedding Effectiveness. To evaluate the effectiveness of FU embeddings, we examine the training results using data from the same district in Beijing. First, by performing DBSCAN clustering on learned embeddings, we evaluate whether geographical similarity is encoded. Figure 12, which shows the resulting 33 clusters, confirms that FUs from close locations are clustered together in the hidden space.

Figure 12: FU embedding clusters of a district in Beijing on map (left) and after T-SNE (right).

Next we demonstrate that high-quality pooling potential can be captured by FU embedding similarity, i.e., HPP. Figure 13(a) shows four cases of FU pairs with high HPP, including (1) FU pair with pick-up and delivery AOIs located closely, (2) nearby parallel FU pair, (3) FU pair where one runs alongside the other, and (4) head-to-tail FU pair, with the tail one pointing to high-order-density AOIs, leading to less courier empty run time⁶ after completing deliveries. Orders in these FU pairs can be pooled for simultaneous delivery to improve courier efficiency. Meanwhile, we also identify FU pairs with low HPP. Figure 13(b) illustrates four cases of this situation, including (1) FU pair with the same delivery AOI but pick-up AOIs located far apart, (2) reverse parallel FU pair, (3) FU pair where one FU runs alongside the other but points to a low-order-density AOI, leading to longer courier empty run time, and (4) head-to-tail FU pair that also leads to a low-order-density area. These FU pairs are unlikely to be efficiently pooled together and may undermine courier efficiency.

⁶ Empty run time refers to the empty cruising time before couriers deliver next orders.

(a) FU pairs with high similarity. (b) FU pairs with low similarity.
Figure 13: FU pair cases in different similarity levels.

4.2 Order Combination and Courier Recall

The proposed method, ILIMA+SCDN, is evaluated against the current online implementation, which utilizes ILIMA with a ruled batching method (ILIMA+Ruled), and MNDS, a metaheuristic algorithm used in [3]. Experiments are conducted in a mid-sized Chinese city, involving around 500 orders and 2,500 couriers in a dispatch cycle during noon peak.

The comparison results on both computational cost and solution quality are presented in Table 2. The ILIMA+SCDN approach enhances the total MD score of MOA solutions by 5.3% compared to the ILIMA+Ruled method, without incurring a significant increase in time consumption. However, it lags by 1.2 pp behind MNDS. Despite this, MNDS requires exploration of a much larger search space and massive PDRP calculations, which takes over 20 seconds on average, making it unsuitable for online use. Hence, the proposed method excels at balancing computational time and solution quality, securing more optimal MOA solutions in real-time. Moreover, Figure 14 illustrates that the overall combination level grows as the percentage of couriers assigned only one order decreases by 16.3 pp. This shift results in increasing order consolidation. Online A/B tests show that, while maintaining delivery experience, courier efficiency, i.e., orders completed per hour, is augmented by 3.7%.

Table 3 presents the results of offline experiments conducted with varying order volumes. In different order size scenarios, the

proposed ILIMA+SCDN method significantly enhances the MD score over the existing ILIMA+Ruled method. Regarding PDRP calculations, for order volumes of up to 400, our proposed ILIMA+SCDN method requires fewer PDRP calculations than the ILIMA+Ruled method. Nevertheless, as the order volume escalates, the computational burden of both methods exhibits nearly linear growth, aligning with the online time requirements.

Table 2: Computation cost and score improvement of MOA.

Method        Online PDRP Calculations   Online Computation Time (s)   MD Score Improvement
ILIMA+Ruled   44,541                     5.6                           0%
ILIMA+SCDN    48,998                     6.9                           5.3%
MNDS          /                          /                             6.5%

(a) ILIMA+Rule. (b) ILIMA+SCDN. (c) MNDS.
Figure 14: Combination level distribution.

Table 3: MOA results across various order sizes.

Method        (0, 200]   (200, 400]   (400, 600]   (600, 800]   (800, 1000]
MD Score Improvement
ILIMA+Ruled   0%         0%           0%           0%           0%
ILIMA+SCDN    1.0%       4.0%         4.4%         5.5%         3.7%
MNDS          1.7%       5.3%         5.3%         6.9%         5.6%
PDRP Calculations
ILIMA+Ruled   4,285      21,910       37,323       57,700       79,596
ILIMA+SCDN    4,250      20,589       38,358       65,011       94,292

4.3 Highly Efficient Delivery Mode

Figure 15 depicts 5 SEHs identified in a specific district of Beijing during weekday noon peak period (11:00-12:59). In response to fluctuations in order structures, the network configuration of each SEH is updated every half hour. On average, each SEH processes about 81 orders every half hour with an average HPP of 0.65, ensuring high order density and strong network connectivity. Moreover, the maximum number of orders pending assignment in each cycle is less than 10. By allocating 5 to 8 couriers per SEH, we significantly simplify the complexity of MOA solutions for each SEH.

Figure 15: SEHs over time, with each image capturing half an hour. Colors denote different areas, bold lines for internal SEH FUs, and thin lines for external FUs.

Taking a SEH in Beijing as an example, online tests show a major boost in order pooling. During noon peak, a courier can accept over 7 orders at once. And the percentage of SEH couriers picking up over 5 orders simultaneously in the same AOI has risen by 23.5 pp compared to past performance. Likewise, the percentage of SEH couriers delivering over 5 orders at once in the same AOI has increased by 20 pp. The average courier incremental pick-up time has been reduced by 51% and delivery time by 21%. These enhancements lead to a 45-55% boost in courier efficiency, i.e., orders completed per hour, while maintaining consistent work hours and on-time delivery standards. Figure 16 illustrates the superior performance of SEH mode against city average level in noon peak, where each bar corresponds to the trial performance of a specific courier.

Figure 16: Courier performance in a SEH mode in noon peak.

5 CONCLUSION

This paper proposed a systemic solution framework, SCDN, based on an Enhanced GATNE method tailored for OFD, to resolve the real-time OFD order pooling problem. It uncovers the latent potential for order pooling embedded within SC trajectories, which can strengthen system awareness and effectively inform decisions. Accordingly, the vast search space of NP-hard MOA problems in OFD is effectively pruned through scalable similarity calculations of simple vectors. Thus high-quality and comprehensive pooling outcomes are found in real time. Moreover, the outcomes highlight SEHs for OFD, where highly-efficient delivery modes are built for continuously improving efficiency. SCDN has now been deployed in Meituan. Online tests show it has achieved excellent performance and is well acknowledged by all the stakeholders.

REFERENCES

[1] Olivier Briant, Hadrien Cambazard, Diego Cattaruzza, Nicolas Catusse, Anne-Laure Ladier, and Maxime Ogier. 2020. An efficient and general approach for the joint order batching and picker routing problem. European journal of operational research 285, 2 (2020), 497–512.
[2] Yukuo Cen, Xu Zou, Jianwei Zhang, Hongxia Yang, Jingren Zhou, and Jie Tang. 2019. Representation learning for attributed multiplex heterogeneous network. In Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery & data mining. 1358–1368.
[3] Jing-Fang Chen, Ling Wang, Hao Ren, Jize Pan, Shengyao Wang, Jie Zheng, and Xing Wang. 2022. An imitation learning-enhanced iterated matching algorithm for on-demand food delivery. IEEE Transactions on Intelligent Transportation Systems 23, 10 (2022), 18603–18619.
[4] Peng Cui, Xiao Wang, Jian Pei, and Wenwu Zhu. 2018. A survey on network embedding. IEEE transactions on knowledge and data engineering 31, 5 (2018), 833–852.
[5] Tao Feng, Huan Yan, Huandong Wang, Wenzhen Huang, Yuyang Han, Hongsen Liao, Jinghua Hao, and Yong Li. 2023. ILRoute: A Graph-based Imitation Learning Method to Unveil Riders’ Routing Strategies in Food Delivery Service. In Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 4024–4034.
[6] Mihajlo Grbovic and Haibin Cheng. 2018. Real-time personalization using embeddings for search ranking at airbnb. In Proceedings of the 24th ACM SIGKDD international conference on knowledge discovery & data mining. 311–320.
[7] Aditya Grover and Jure Leskovec. 2016. node2vec: Scalable feature learning for networks. In Proceedings of the 22nd ACM SIGKDD international conference on Knowledge discovery and data mining. 855–864.
[8] Qingbo Hu, Sihong Xie, Jiawei Zhang, Qiang Zhu, Songtao Guo, and Philip S Yu. 2016. HeteroSales: Utilizing heterogeneous social networks to identify the next enterprise customer. In Proceedings of the 25th International Conference on World Wide Web. 41–50.
[9] Shenggong Ji, Yu Zheng, Zhaoyuan Wang, and Tianrui Li. 2019. Alleviating users’ pain of waiting: Effective task grouping for online-to-offline food delivery services. In The World Wide Web Conference. 773–783.
[10] Manas Joshi, Arshdeep Singh, Sayan Ranu, Amitabha Bagchi, Priyank Karia, and Puneet Kala. 2021. Batching and matching for food delivery in dynamic road networks. In 2021 IEEE 37th International Conference on Data Engineering (ICDE). IEEE, 2099–2104.
[11] Manas Joshi, Arshdeep Singh, Sayan Ranu, Amitabha Bagchi, Priyank Karia, and Puneet Kala. 2022. FoodMatch: Batching and Matching for Food Delivery in Dynamic Road Networks. ACM Transactions on Spatial Algorithms and Systems (TSAS) 8, 1 (2022), 1–25.
[12] Krishnaram Kenthapadi, Benjamin Le, and Ganesh Venkataraman. 2017. Personalized job recommendation system at linkedin: Practical challenges and lessons learned. In Proceedings of the eleventh ACM conference on recommender systems. 346–347.
[13] Shima Khoshraftar and Aijun An. 2022. A survey on graph representation learning methods. arXiv preprint arXiv:2204.01855 (2022).
[14] Yile Liang, Donghui Li, Jiuxia Zhao, Xuetao Ding, Huanjia Lian, Jinghua Hao, and Renqing He. 2023. Enhancing Dynamic On-demand Food Order Dispatching via Future-informed and Spatial-temporal Extended Decisions. In Proceedings of the 32nd ACM International Conference on Information and Knowledge Management. 4702–4708.
[15] Vittorio Maniezzo, Thomas Stützle, and Stefan Voß. 2021. Matheuristics. Springer.
[16] Eva-Marie Meemken, Marc F Bellemare, Thomas Reardon, and Carolina M Vargas. 2022. Research and policy for the food-delivery revolution. Science 377, 6608 (2022), 810–813.
[17] Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S Corrado, and Jeff Dean. 2013. Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013).
[18] Seyedali Mirjalili and Seyedali Mirjalili. 2019. Genetic algorithm. Evolutionary Algorithms and Neural Networks: Theory and Applications (2019), 43–55.
[19] Eduardo G Pardo, Sergio Gil-Borrás, Antonio Alonso-Ayuso, and Abraham Duarte. 2023. Order Batching Problems: taxonomy and literature review. European Journal of Operational Research (2023).
[20] Bryan Perozzi, Rami Al-Rfou, and Steven Skiena. 2014. Deepwalk: Online learning of social representations. In Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining. 701–710.
[21] Rohan Ramanath, Hakan Inan, Gungor Polatkan, Bo Hu, Qi Guo, Cagri Ozcaglar, Xianren Wu, Krishnaram Kenthapadi, and Sahin Cem Geyik. 2018. Towards deep and representation learning for talent search at linkedin. In Proceedings of the 27th ACM international conference on information and knowledge management. 2253–2261.
[22] Damian Reyes, Alan Erera, Martin Savelsbergh, Sagar Sahasrabudhe, and Ryan O’Neil. 2018. The meal delivery routing problem. Optimization Online 6571 (2018).
[23] Akhil Shetty, Junjie Qin, Kameshwar Poolla, and Pravin Varaiya. 2022. The Value of Pooling in Last-Mile Delivery. In 2022 IEEE 61st Conference on Decision and Control (CDC). IEEE, 531–538.
[24] Michele D Simoni and Matthias Winkenbach. 2023. Crowdsourced on-demand food delivery: An order batching and assignment algorithm. Transportation Research Part C: Emerging Technologies 149 (2023), 104055.
[25] Jian Tang, Meng Qu, Mingzhe Wang, Ming Zhang, Jun Yan, and Qiaozhu Mei. 2015. Line: Large-scale information network embedding. In Proceedings of the 24th international conference on world wide web. 1067–1077.
[26] Lei Tang, Zihang Liu, Rongguo Zhang, Zongtao Duan, and Yunji Liang. 2021. Who Will Travel With Me? Personalized Ranking Using Attributed Network Embedding for Pooling. IEEE Transactions on Intelligent Transportation Systems 23, 8 (2021), 12311–12327.
[27] Lei Tang, Zihang Liu, Yaling Zhao, Zongtao Duan, and Jingchi Jia. 2020. Efficient ridesharing framework for ride-matching via heterogeneous network embedding. ACM Transactions on Knowledge Discovery from Data (TKDD) 14, 3 (2020), 1–24.
[28] Jizhe Wang, Pipei Huang, Huan Zhao, Zhibo Zhang, Binqiang Zhao, and Dik Lun Lee. 2018. Billion-scale commodity embedding for e-commerce recommendation in alibaba. In Proceedings of the 24th ACM SIGKDD international conference on knowledge discovery & data mining. 839–848.
[29] Xing Wang, Ling Wang, Shengyao Wang, Yang Yu, Jing-fang Chen, and Jie Zheng. 2021. Solving online food delivery problem via an effective hybrid algorithm with intelligent batching strategy. In International Conference on Intelligent Computing. Springer, 340–354.
[30] Jianglong Yang, Li Zhou, and Huwei Liu. 2021. Hybrid genetic algorithm-based optimisation of the batch order picking in a dense mobile rack warehouse. Plos one 16, 4 (2021), e0249543.
[31] Baris Yildiz and Martin Savelsbergh. 2019. Provably high-quality solutions for the meal delivery routing problem. Transportation Science 53, 5 (2019), 1372–1388.
[32] Yang Yu, Qingte Zhou, Shenglin Yi, Huanyu Zheng, Shengyao Wang, Jinghua Hao, Renqing He, and Zhizhao Sun. 2021. Delay to group in food delivery system: A prediction approach. In International Conference on Intelligent Computing. Springer, 540–551.
[33] Lingyu Zhang, Tao Hu, Yue Min, Guobin Wu, Junying Zhang, Pengcheng Feng, Pinghua Gong, and Jieping Ye. 2017. A taxi order dispatch model based on combinatorial optimization. In Proceedings of the 23rd ACM SIGKDD international conference on knowledge discovery and data mining. 2151–2159.
[34] Jie Zheng, Ling Wang, Li Wang, Shengyao Wang, Jing-Fang Chen, and Xing Wang. 2022. Solving stochastic online food delivery problem via iterated greedy algorithm with decomposition-based strategy. IEEE Transactions on Systems, Man, and Cybernetics: Systems 53, 2 (2022), 957–969.
[35] Qingte Zhou, Huanyu Zheng, Shengyao Wang, Jinghua Hao, Renqing He, Zhizhao Sun, Xing Wang, and Ling Wang. 2020. Two fast heuristics for online order dispatching. In 2020 IEEE Congress on Evolutionary Computation (CEC). IEEE, 1–8.
[36] Yida Zhu, Liying Chen, Daping Xiong, Shuiping Chen, Fangxiao Du, Jinghua Hao, Renqing He, and Zhizhao Sun. 2023. C-AOI: Contour-based Instance Segmentation for High-Quality Areas-of-Interest in Online Food Delivery Platform. In Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 5750–5759.

A MANY-TO-ONE ASSIGNMENT PROBLEM AT EACH DISPATCH CYCLE

As shown in Figure 3, the calculation volume increases very fast with the number of orders and couriers. Different order combinations of order set $O_t$ are considered. For example, the number of $l$-order combinations is $C^{l}_{|O_t|}$. Since the MD score of assigning combinations of orders is not equivalent to the sum of scores of individual assignments, the calculation volume of MD score is $\sum_{l \in L} C^{l}_{|O_t|} \times \left| \cup_{\forall \bar{o}_l} R_t^{\bar{o}_l} \right|$, where $R_t^{\bar{o}_l}$ is the set of couriers for $l$-order combination $\bar{o}_l$ at dispatch time $t$.

\[
\min_{x_t \in \aleph_t} \; \sum_{\bar{o} \in comb(O_t)} \; \sum_{r \in R_t^{\bar{o}}} \Big( \sum_{g \in G} \eta_t^{g} \times f_{t,r}^{g,\bar{o}} \Big) \times x_r^{\bar{o}}
\]
\[
\text{s.t.} \quad \aleph_t = \left\{ \begin{array}{l}
\sum_{\bar{o}(o)} \sum_{r \in R_t^{\bar{o}(o)}} x_r^{\bar{o}(o)} = 1, \quad \forall o \in O_t \\[4pt]
x_r^{\bar{o}} = \prod_{o \in \bar{o}} x_r^{o}, \quad \forall \bar{o} \in comb(O_t)
\end{array} \right\} \tag{9}
\]
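
To make the growth in calculation volume described in Appendix A concrete, the short Python sketch below counts MD score evaluations as the sum over combination sizes of the number of order combinations times the number of candidate couriers. It is a simplified upper bound under the assumption (not from the paper) that every combination shares the same candidate courier set; the function name and parameters are hypothetical.

```python
from math import comb


def md_score_volume(n_orders, max_size, n_couriers):
    """Count MD score evaluations for combinations of size 1..max_size.

    Assumes every combination is scored against all n_couriers candidates,
    which upper-bounds the courier-set term in Appendix A.
    """
    return sum(comb(n_orders, size) * n_couriers for size in range(1, max_size + 1))


# Scale of the Section 4.2 experiment (about 500 orders and 2,500 couriers).
single_only = md_score_volume(500, 1, 2500)   # one-to-one assignments only
up_to_pairs = md_score_volume(500, 2, 2500)   # singles plus two-order combinations
print(single_only, up_to_pairs)
```

Under these assumptions, allowing two-order combinations raises the count from 1,250,000 to 313,125,000 evaluations, which illustrates why the search space has to be pruned before MD scores are computed.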

After getting all these MD scores, the MOA problem can be formulated into an integer programming problem in Equation (9). The objective function is to minimize the total MD scores for different goals, where $f_{t,r}^{g,\bar{o}}$ is the MD score of assigning order combination $\bar{o}$ to courier $r$ at time $t$ for goal $g$, $comb(O_t)$ refers to all the possible combinations constructed by orders in $O_t$, and $\eta_t^{g}$ is the weight of goal $g$ in the objective function at time $t$. The constraint is to make sure each combination $\bar{o}$ can only be assigned to one courier and only one combination of each order can be selected. $\bar{o}(o)$ represents the order combination containing order $o$.

B DEFINITION OF SKILLED COURIER AND SELECTION CRITERIA OF ROUTE SESSIONS

As mentioned above, SC refers to the couriers with relatively high efficiency, currently set as the top rank 5%-35% in a delivery region. It should be noted that in order to prevent extreme cases from affecting the validity of the learning outcomes, the top 5% of couriers have been excluded.

The SC route sessions of both pick-up and delivery type, for constructing the network, are selected based on the following criteria:
(1) time interval between the execution of two consecutive orders less than 30 minutes;
(2) no overtime orders;
(3) no speeding behaviours;
(4) no orders with negative feedback reported.
Then based on the carefully selected sessions of SCs, we construct the corresponding AMHEN using the method outlined in Section 2.

C FU NODE ATTRIBUTES IN AMHEN

We incorporate rich spatial and temporal information as attributes of a FU node, for a specific scenario (i.e., weekday/weekend, peak/idle time), mainly including:
(1) average order volume of FU, and the corresponding pick-up and delivery AOIs in the scenario for last 30 days;
(2) average meal-waiting and pick-up time duration of the corresponding pick-up AOI in the scenario for last 30 days;
(3) average delivery time duration of the corresponding delivery AOI in the scenario for last 30 days;
(4) average delivery distance of the FU;
(5) average FU delivery period of time since consumers order in the scenario for last 30 days;
(6) type and number of natural barriers (e.g. bridge, river, highway) along the FU path;
(7) latitudes and longitudes of the center points of the corresponding pick-up and delivery AOIs;
(8) the proportion of SCs who chose the corresponding pick-up and delivery AOIs as their preferred locations for the scenario in the past 30 days.

D IMPLEMENTATION OF EATNE ALGORITHM

The proposed EATNE algorithm is summarized in Algorithm 1.

Algorithm 1 EATNE for OFD
Input: Network $G$; embedding dimension $d$; edge embedding dimension $s$; window size $c$; learning rate $\eta$; margin loss min distance $m_p$, $m_d$; coefficients $\alpha$, $\beta$, $\gamma_p^P$, $\gamma_d^P$, $\gamma_p^D$, $\gamma_d^D$.
Output: Embedding $\mathbf{v}_i$, and embeddings $\mathbf{v}_{i,p}$ and $\mathbf{v}_{i,d}$ on the pick-up and delivery edges, for all $v_i \in V$.
1: Initialize all the model parameters $\theta$.
2: Generate positive data sets $D_p^P$ and $D_d^P$ by random walk on the pick-up and delivery edges, respectively.
3: Randomly sample FU pairs within the same delivery region, then add them to negative data sets $D_p^N$ and $D_d^N$.
4: while not converged do
5:   for each FU pair in $D_p^P$, $D_d^P$ do
6:     Calculate $\mathbf{v}_{i,p}$ and $\mathbf{v}_{i,d}$ using Equations (4) and (5), respectively;
7:     Sample $m$ negative samples and calculate the loss value using Equation (6).
8:     Update model parameters $\theta$ by $\partial E / \partial \theta$.
9:   end for
10: end while
11: Set $\mathbf{v}_i$ as the average of $\mathbf{v}_{i,p}$ and $\mathbf{v}_{i,d}$.

E EATNE MODEL PARAMETER CONFIGURATION

The detailed parameter setting is shown in Table 4. We employ the Adam optimizer with default settings for training. The model implements early stopping if there is no improvement in the ROC-AUC on the validation set within a single training epoch.

Table 4: Parameter configuration of EATNE model.

Notation                                             Description                 Setting Value
$d$                                                  base embedding dimension    200
$s$                                                  edge embedding dimension    20
$l$                                                  random walk length          10
$c$                                                  sampling window size        3
$m_p$, $m_d$                                         margin loss min distance    0.3
$\eta$                                               learning rate               0.001
$\alpha_p$, $\alpha_d$, $\beta_p$, $\beta_q$         edge weights                1
$\gamma_p^P$, $\gamma_d^P$, $\gamma_p^D$, $\gamma_d^D$   weights in loss objective   1

F IMPLEMENTATION DETAILS OF ORDER COMBINATION AND COURIER RECALL.

The MOA problem in our system is now solved using a constructive heuristic framework. The process during each dispatch cycle may require multiple iterations. Let $O^k$ denote the set of pending orders during iteration $k$, with $O^0 = O$ initially, where $O$ represents all pending orders during this dispatch cycle. At iteration $k$,
(1) Evaluation stage: For the pending orders $O^k$ and their associated recalled courier candidates $R_o^k$, $o \in O^k$, MD scores $\{\{f_{or}\}_{r \in R_o^k}\}_{o \in O^k}$ are calculated.
(2) Matching stage: Based on current MD scores, a one(order)-to-one(courier) assignment decision is made following greedy policy (aiming to optimize the sum of MD scores for all matching relations at the current iteration). This may result in only a subset of $O^k$ being successfully assigned.

(3) Termination condition: Denote the remaining unassigned orders as $\bar{O}^k$. If $\bar{O}^k = \emptyset$, stop the iterations. Otherwise, update the state of couriers by including newly assigned orders, let $O^{k+1} = \bar{O}^k$, $k = k + 1$, and proceed to Step (1).

F.1 Order Combination Mechanism.

Although the above algorithm has good performance in solving, it tends to promote one-to-one assignment results, which is not conducive to sufficient order pooling. To facilitate many-to-one assignments, high-quality and mutually exclusive order combinations are identified based on HPP, and incorporated into the algorithm as expanding entities rather than single orders. The evaluation stage at iteration $k$ is executed as follows:
(1) For pending orders $O^k$, calculate the HPPs between the FUs of any two orders and denote the combination set as $C^k$. Set the order combination set preserved for MD evaluation at iteration $k$ as $\hat{C}^k = \emptyset$.
(2) Prune two-order combinations with low HPPs ($p_{o_1,o_2} < P_1$), i.e., let $C^k = C^k - C_{low}^k$.
(3) Repeat this step until $C^k = \emptyset$: pick $c = \{o_1, o_2\} \in C^k$ with the highest HPP value, let $\hat{C}^k = \hat{C}^k + \{c\}$. Then remove its related entries in $C^k$ and $O^k$, i.e., let $C^k = C^k - \{c \mid o_1 \in c \text{ or } o_2 \in c,\ c \in C^k\}$, $O^k = O^k - \{o_1\} - \{o_2\}$.
(4) Use $\hat{C}^k$ and $O^k$ as decision entities and calculate the MD scores with their associated couriers.
The above process is illustrated in Figure 8. In practice, $P_1$ is set to 0.6.

F.2 Courier Recall Mechanism.

To reduce MD score calculation volume, we can further refine the courier candidates recalled for each order/order combination using HPP. For the evaluation stage at iteration $k$, the pending entity sets are $\hat{C}^k$ and $O^k$, and the courier recall mechanism is executed as follows:
(1) For $o \in O^k$, denote the corresponding courier candidate set as $R_o^k$. For $r \in R_o^k$, if the on-hand order set $O_r^k \neq \emptyset$, calculate the average HPP of $o$ and orders in $O_r^k$ as an estimation of MD score, i.e. $\tilde{f}_{or} = \frac{1}{|O_r^k|} \sum_{o' \in O_r^k} p_{o,o'}$.
For the on-hand order already picked up by courier $r$, its FU can be considered as the FU starting from the AOI where the courier is currently located and ending at its delivery AOI.
For the on-hand orders whose FU embedding is absent, the associated HPP is set as 0.
If $\tilde{f}_{or}$ is lower than threshold $P_2$, courier $r$ will be removed from the candidate set, i.e. $R_o^k = R_o^k - \{r\}$.
(2) For $c = \{o_1, o_2\} \in \hat{C}^k$, denote the corresponding courier candidate set as $R_c^k$, which is the intersection of courier candidate sets of $o_1$ and $o_2$. For $r \in R_c^k$, calculate the average HPP of $o_1$ and $o_2$ as in Step (1), respectively. If either $\tilde{f}_{o_1 r}$ or $\tilde{f}_{o_2 r}$ is lower than threshold $P_2$, courier $r$ will be removed from the candidate set, i.e. $R_c^k = R_c^k - \{r\}$.
(3) For orders in $O^k$ and combinations in $\hat{C}^k$, calculate the MD scores with their refined couriers.
The above process is illustrated in Figure 9. In practice, $P_2$ is set to 0.5.

G SEH IDENTIFICATION APPROACH

We utilize BP to identify SEHs during each time interval from FUs with high FEI in a city or nearby areas. In this section, we introduce the variable definitions, objective function, and constraints of the model.

The decision variable $x_f^g$ represents whether FU $f$ belongs to SEH $g$. To calculate the average HPP in each SEH, we introduce a binary auxiliary variable $y_{f,f'}^g$, which indicates whether FU $f$ and $f'$ belong to SEH $g$ simultaneously. The objective function in Equation (10) is to maximize the average HPP in each SEH, where $p_{f,f'}$ is the HPP between FU $f$ and $f'$.

The constraint in Equation (11) limits each FU to appear in only one SEH. Equation (12) limits the minimum and maximum number of FUs in each SEH. Equation (13) limits the minimum number of orders in each SEH, where $n_f$ is the number of orders of FU $f$. Equation (14) and Equation (15) ensure that $y_{f,f'}^g = 1$ if and only if $x_f^g = x_{f'}^g = 1$. Equation (16) constrains the minimum average HPP in each SEH $g$. Equation (17) and Equation (18) ensure that all the decision variables are binary.

\begin{align}
\max \quad & \sum_{g \in G} \frac{\sum_{f \in F} \sum_{f' \in F, f' \neq f} p_{f,f'} \times y_{f,f'}^{g}}{\sum_{f \in F} \sum_{f' \in F, f' \neq f} y_{f,f'}^{g}} \tag{10} \\
\text{s.t.} \quad & \sum_{g \in \bar{G}} x_f^{g} = 1, \quad \forall f \in F \tag{11} \\
& |g|_{\min} \le \sum_{f \in F} x_f^{g} \le |g|_{\max}, \quad \forall g \in G \tag{12} \\
& \sum_{f \in F} n_f \times x_f^{g} \ge N, \quad \forall g \in G \tag{13} \\
& y_{f,f'}^{g} \ge x_f^{g} + x_{f'}^{g} - 1 \tag{14} \\
& y_{f,f'}^{g} \le x_f^{g}, \quad y_{f,f'}^{g} \le x_{f'}^{g} \tag{15} \\
& \sum_{f \in F} \sum_{f' \in F, f' \neq f} (p_{f,f'} - P) \times y_{f,f'}^{g} \ge 0, \quad \forall g \in G \tag{16} \\
& x_f^{g} \in \{0, 1\}, \quad \forall f \in F,\ \forall g \in G \tag{17} \\
& y_{f,f'}^{g} \in \{0, 1\}, \quad \forall f, f' \in F,\ f \neq f',\ \forall g \in G \tag{18}
\end{align}
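
To illustrate how a candidate solution of the SEH identification model in Appendix G can be scored, the following Python sketch evaluates a given FU-to-SEH assignment against the objective in Equation (10) and the constraints in Equations (12), (13) and (16). It is only an evaluator under stated assumptions, not the BP solver or the genetic algorithm used in the paper: the function name, the nested-dictionary HPP lookup, the use of `None` for FUs left outside every SEH, and the skipping of empty SEHs are all hypothetical choices for this example. Because each FU maps to at most one SEH in the input dictionary, Equation (11) holds by construction.

```python
from itertools import permutations


def evaluate_seh_assignment(assign, hpp, n_orders, size_min, size_max,
                            min_orders, min_avg_hpp):
    """Score an FU -> SEH assignment: returns (objective, feasible).

    assign: dict FU -> SEH id, or None if the FU is in no SEH.
    hpp: nested dict, hpp[f][f2] is the HPP between FUs f and f2.
    n_orders: dict FU -> historical order count n_f.
    """
    groups = {}
    for fu, seh in assign.items():
        if seh is not None:
            groups.setdefault(seh, []).append(fu)

    objective, feasible = 0.0, True
    for fus in groups.values():
        # Ordered pairs (f, f') with f != f', i.e. where y^g_{f,f'} = 1.
        pairs = list(permutations(fus, 2))
        total = sum(hpp[f][f2] for f, f2 in pairs)
        objective += total / len(pairs) if pairs else 0.0    # Equation (10)
        size_ok = size_min <= len(fus) <= size_max            # Equation (12)
        volume_ok = sum(n_orders[f] for f in fus) >= min_orders  # Equation (13)
        hpp_ok = total - min_avg_hpp * len(pairs) >= 0        # Equation (16)
        feasible = feasible and size_ok and volume_ok and hpp_ok
    return objective, feasible
```

A search heuristic such as a genetic algorithm could call an evaluator of this kind as its fitness function, penalizing or discarding assignments for which `feasible` is false.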
