This document discusses the development of an AI-driven Digital Twin (DT) for Healthcare Intelligent Transportation Systems (HITS) to improve the real-time tracking of emergency vehicles, specifically ambulances. The study integrates Support Vector Regression (SVR) and Deep Neural Networks (DNN) to predict ambulance locations, addressing the synchronization delays between physical and virtual systems, achieving a significant enhancement in real-time representation accuracy. The proposed methodology demonstrates a reduction in synchronization delays by approximately 88% to 93%, showcasing the potential of AI in optimizing healthcare transportation systems.


Accurate AI-Driven Emergency Vehicle Location Tracking in Healthcare ITS Digital Twin


Sarah Al-Shareeda∗†§, Yasar Celik∗, Bilge Bilgili∗, Ahmed Al-Dubai‡, and Berk Canberk‡
∗ Department of AI and Data Engineering, Istanbul Technical University, Turkey
† BTS Group, Turkey
§ Center for Automotive Research (CAR), The Ohio State University, USA
‡ School of Computing, Engineering and The Built Environment, Edinburgh Napier University, UK

Email: {alshareeda, celiky20, bilgili21}@itu.edu.tr and {a.al-dubai, b.canberk}@napier.ac.uk


arXiv:2502.03396v1 [cs.LG] 5 Feb 2025

Abstract—Creating a Digital Twin (DT) for Healthcare Intelligent Transportation Systems (HITS) is an active research trend aimed at enhancing HITS management, particularly in emergencies, where ambulances must arrive at the crash scene on time and tracking their real-time location is crucial to the medical authorities. Despite the claim of real-time representation, a temporal misalignment persists between the physical and virtual domains, leading to discrepancies in the ambulance's location representation. This study proposes integrating AI predictive models, specifically Support Vector Regression (SVR) and Deep Neural Networks (DNN), within a constructed mock DT data pipeline framework to anticipate the medical vehicle's next location in the virtual world. These models align virtual representations with their physical counterparts, i.e., they offset the synchronization delay between the two worlds. Trained on a historical geospatial dataset, SVR and DNN exhibit high prediction accuracy in MATLAB and Python environments. Through various testing scenarios, we visually demonstrate the efficacy of our methodology, showcasing SVR and DNN's key role in significantly reducing the witnessed gap within the HITS's DT. This approach enhances real-time synchronization in emergency HITS by approximately 88% to 93%.

Index Terms—Healthcare ITS, Digital Twins, Location Prediction, Artificial Intelligence, Delay Offsetting

I. PROBLEM IN FOCUS

In contemporary transportation management, Healthcare Intelligent Transportation Systems (HITS) are fundamental to facilitating collaborative information exchange among medical vehicles and infrastructure, especially in emergencies and accidents. Incorporating Digital Twins (DTs) into HITS promises significant advantages, including real-time data integration, improved traffic management, and democratized data-driven decision making. DT deployment within HITS can take various forms, including cloud-based, edge-based, or hybrid approaches [1]–[5]. However, achieving real-time synchronization between the physical and virtual facets of HITS remains elusive, resulting in an enduring temporal gap that hinders a real-time representation of the physical system. For example, in edge-based deployment, this delay originates from the process by which the physical HITS transmits information to the edge server that hosts the constructed DT. Moreover, this temporal discrepancy is observed across the DT's data, virtual, and service layers. Ultimately, when final decisions and responses are transmitted to the physical system, the system's state may have changed before receiving these responses, underscoring the challenge of achieving true real-time synchronization. This temporal misalignment is exemplified in Fig. 1, where the physical HITS records data at time (say) $T_0$ ¹, while the DT lags significantly at the delayed timestamp $T_0^{+++}$. Consequently, the response is returned at a very late timestamp $T_0^{+++++}$. This discrepancy between real-time data acquisition and DT response generation urgently requires a solution. It underscores the critical need to bridge the synchronization gap between physical HITS and their DTs, thus advancing the quest for real-time representation of healthcare transportation systems.

¹ Current timestamp $T_0$, three-times delayed timestamp $T_0^{+++}$, and five-times delayed timestamp $T_0^{+++++}$.

Numerous studies have addressed the issue of synchronization from various angles [6], [7]. An emerging approach involves using AI-based predictive models within the DT plane to predict medical vehicle locations and behaviors ahead of time, effectively mitigating synchronization delays. A detailed literature review reveals many efforts addressing the general challenge of location prediction [8]–[11]. These works collectively showcase the use of AI for vehicle location prediction, each contributing techniques and models to improve prediction accuracy, efficiency, and adaptability. However, few of these works utilize the concept of DT, that is, making location prediction in the DT domain. Maheswaran et al. [12] improve autonomous driving systems by continuously updating the locations of autonomous and human-piloted vehicles on road segments. To mitigate communication delays, they incorporate a forecaster algorithm in the twin capable of predicting future vehicle locations. Along the same lines, our present study aims to bridge the gap by contributing to the field of HITS in the following key ways:

A. Main Contributions
1) We address the challenge of the temporal gap in HITS DT using AI prediction techniques, ensuring a seamless alignment between virtual and actual medical vehicle locations.
2) We introduce Support Vector Regression (SVR) as our Machine Learning (ML) model and Deep Neural Networks (DNN) as the Deep Learning (DL) model. This approach
takes advantage of the strengths of both the ML and the DL techniques to predict the next location of the ambulance accurately.
3) A thorough performance analysis of the SVR and DNN models is conducted using three key metrics in two simulation environments, MATLAB and Python, to examine the consistency of the models across different platforms.
4) We build a mock DT environment as a Proof of Concept (PoC) using Docker [13] and Apache Kafka [14] to support a real-time HITS data pipeline of actual and predicted locations. Our DT enables accurate data visualization via Grafana [15], enhancing synchronization in HITS operations.
5) We showcase the efficacy of our models by demonstrating a significant reduction in the witnessed delay within the HITS DT. This improvement contributes to a more accurate real-time synchronization.

Our research advances HITS by reducing the witnessed DT delay, improving prediction accuracy, and offering practical solutions for real-time applications in healthcare transportation systems. The rest of the paper is organized as follows: the preliminary description of our actual and virtual HITS and the proposed prediction models are presented in Section II. In Section III, the performance of the proposed models is evaluated, the DT is constructed, and the effect of prediction on compensating for the observed delay is discussed. Finally, the paper is concluded with key extensions in Section IV.

II. PROPOSED SOLUTION: PREDICTING THE NEXT LOCATION OF THE AMBULANCE IN HITS DT TO OFFSET THE OBSERVED DELAY

Our HITS comprises a fleet of n emergency service vehicles, as depicted in Fig. 1, each identified by the index $V_i$, $i = \{1, \ldots, n\}$. These vehicles are equipped with real-time location tracking systems that provide coordinates $(x_{V_i}^{T}, y_{V_i}^{T})$ at time $T$. Communication of status occurs through vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) channels, with real-time situational information transmitted to the Road Side Unit (RSU). Despite adjusting for factors such as congestion and contention in the physical network, an unavoidable communication delay ∆T is observed from the perspective of the RSU. Deploying a virtual twin of the HITS at the RSU's edge server exacerbates the temporal gap and asynchronization with the physical system. The DT of each vehicle consistently lags by ∆τ seconds behind its real-world counterpart. Our primary goal is to mitigate this observed ∆τ. To address this, we integrate AI models to predict future vehicle locations within the DT framework, aligning the virtual representation with physical reality. Two AI models, SVR as an ML model and DNN as a DL model, operate within the DT layers. The schematic representation of the prediction model is displayed in Fig. 2, with its three steps detailed below.

A. Input Dataset and Preprocessing
Our dataset comprises a historical geospatial GPS trajectory dataset sourced from a fleet of n = 221 regular vehicles, which operated during the first two days of each month throughout 2019. This GPS dataset includes the following feature set: unique vehicle identifiers $V_i$, $i = \{1, \ldots, n\}$, timestamp, speed (km/h), distance traveled (m), duration of stay at a specific location (s), latitude and longitude coordinates of the current location, that is, $(x_{V_i}^{T}, y_{V_i}^{T})$, and the corresponding next location coordinates $(x_{V_i}^{T^+}, y_{V_i}^{T^+})$. In particular, this dataset comprises 1,048,576 timestamps per month; however, to ensure data quality and minimize computational resource utilization, we apply data filtering and restrict our study to a subset of N = 43,856 time instances. Subsequently, this filtered dataset is divided into a training subset, which constitutes 80% of the data, and a validation subset, which holds the remaining 20%.

Data preprocessing is a cornerstone of our model development. In this step, the feature set is transformed so that the data is centered around a mean of 0 and scaled to a standard deviation of 1. This standardization significantly improves the convergence rate during model training and helps the AI models capture the spatio-temporal dynamics of our dataset. This centering will also affect the bias of the SVR model b, as explained in a later note. The next stage involves feeding the data into our selected prediction models, enabling us to anticipate vehicle movements accurately. In this context, we choose the SVR and DNN models: SVR makes precise predictions, while DNNs can discern intricate spatio-temporal patterns, resulting in accurate and dependable forecasts of future vehicle locations. We describe these two models in detail in the following.

B. Utilized AI Prediction Models Description
1) SVR ML Predictive Model: In the context of our dataset, the main objective of regression is to uncover how the changes in the six input features relate to the changes in the next location coordinates $(x_{V_i}^{T^+}, y_{V_i}^{T^+})$ of the vehicle. SVR is our regression method of choice as it can effectively model the sought nonlinear relationship. SVR is an extension of the Support Vector Machine (SVM) algorithm that can capture complex relationships and patterns within the data by finding the "hyperplane" that best fits the data points within an error ε-tube region. In our case, this hyperplane is a mathematical representation of the relationships between the input features and the target x-coordinate $x_{V_i}^{T^+}$ and y-coordinate $y_{V_i}^{T^+}$. We employ dual SVR models: one dedicated to predicting the x-coordinate and the other to predicting the y-coordinate. For simplicity, our description refers to either coordinate as $\hat{y}$ or $f(\hat{x})$, where $\hat{x} = [\hat{x}_1, \hat{x}_2, \ldots, \hat{x}_{\hat{i}}, \ldots, \hat{x}_N]$ represents the input dataset that has N = 43,856 instances of 6-dimensional $\hat{x}_{\hat{i}}$ values. The main steps of formulating our SVR, exhibited in Fig. 3, involve:
i. In the input space, we identify the support vectors and the data points closest to the hyperplane to focus on the most critical data points when creating the prediction model:

$$f(\hat{x}) = \sum_{\hat{i}}^{N} (w_{\hat{i}} \cdot \hat{x}_{\hat{i}}) + b \pm \varepsilon, \tag{1}$$
Fig. 1: Problem in Focus: Witnessed Synchronization Delay between the Physical and Digital Worlds

Fig. 2: Predictive Model Design Structured Steps Within The DT Layers
Fig. 3: SVR Predictive Model Formulation

where $w_{\hat{i}}$ represents the weight, i.e., the normal to the optimal decision hyperplane of (1), and b is the bias or the closest distance to the origin of the coordinate system. In this step, the slack variables $\xi_{\hat{i}}$ and $\xi_{\hat{i}}^{*}$ represent the allowable margin of error beyond the ε-tube.
ii. To make the features of the input linearly separable, a mapping function $\psi(\hat{x}_{\hat{i}})$ is used instead of each $\hat{x}_{\hat{i}}$.
iii. In such transformed feature space of $\psi(\hat{x}_{\hat{i}})$, the optimal hyperplane $f(\hat{x})$ can be found by minimizing the following Lagrange Loss primal problem:

$$L(\hat{y}, f(\hat{x})) = \frac{\|w\|^2}{2} + C \sum \left(\xi_{\hat{i}} + \xi_{\hat{i}}^{*}\right), \tag{2}$$

such that C controls the generalization capabilities of the predictor, $\hat{y} - f(\hat{x}) \leq \varepsilon + \xi_{\hat{i}}$, $f(\hat{x}) - \hat{y} \leq \varepsilon + \xi_{\hat{i}}^{*}$, and $\xi_{\hat{i}} + \xi_{\hat{i}}^{*} \geq 0$ for $\hat{i} = \{1, \ldots, N\}$. As this problem (2) is hard to solve in this primal space of w, the solution is to transform the problem to a dual space of $\alpha_{\hat{i}}, \alpha_{\hat{i}}^{*}$ Lagrange multipliers by letting:

$$w_{\hat{i}} = (\alpha_{\hat{i}} - \alpha_{\hat{i}}^{*}) \cdot \psi(\hat{x}_{\hat{i}}), \quad \hat{i} = \{1, \ldots, N\}. \tag{3}$$

iv. In the dual feature space, the hyperplane is exhibited as:

$$f(\hat{x}) = \sum_{\hat{i}}^{N} (\alpha_{\hat{i}} - \alpha_{\hat{i}}^{*}) \cdot \left(\psi(\hat{x}_{\hat{i}}) \cdot \psi(\hat{x})\right), \tag{4}$$

where $(\psi(\hat{x}_{\hat{i}}) \cdot \psi(\hat{x}))$ is the dot product between the two mapping functions. It turns out that the calculation of such a dot product can be avoided by using the kernel trick concept such that:

$$\left(\psi(\hat{x}_{\hat{i}}) \cdot \psi(\hat{x})\right) = K(\hat{x}_{\hat{i}}, \hat{x}), \quad \hat{i} = \{1, \ldots, N\}. \tag{5}$$

v. In the kernel space, from (5), the kernel trick can directly calculate the similarity between input features by transforming them into a higher-dimensional space. As the choice of the kernel function influences the flexibility and performance of the SVR model and as our dataset is a mix of linear and nonlinear features, we leverage the Gaussian Radial Basis Function (RBF) kernel to simplify capturing the nonlinear relationships for prediction. The formula for the RBF kernel in our SVR is $K(\hat{x}_{\hat{i}}, \hat{x}) = \exp\left(-\frac{\|\hat{x}_{\hat{i}} - \hat{x}\|^2}{2\sigma^2}\right)$, where $\|\hat{x}_{\hat{i}} - \hat{x}\|^2$ is the squared Euclidean distance and σ is the width of the kernel's bell-shaped curve. A smaller σ makes the kernel more localized, while a larger σ makes it more spread out with potentially fewer support vectors; selecting an appropriate σ requires cross-validation to find the optimal value for our specific dataset. As $\hat{x}$ has N samples, we would calculate the kernel value for every $\hat{x}_{\hat{i}}$. This results in an N × N kernel Gram matrix. From such a kernel matrix, we create a correlation matrix by subtracting the mean of each row and column from the matrix, ensuring that the kernel matrix has a zero mean. Next, we divide each element of the centered kernel matrix by the product of the square roots of the corresponding diagonal elements. The correlation matrix provides insights into the relationships between data points in the higher-dimensional space as captured by the kernel function. Elements close to 1 indicate high similarity, values close to −1 indicate anti-correlation, and values close to 0 indicate low or no correlation.
vi. Finally, in the dual space using the kernel trick, the Lagrange Loss function can be easily solved with Quadratic

Programming to maximize:

$$L(\hat{y}, f(\hat{x})) = \frac{1}{2} \sum_{\hat{i},\hat{j}}^{N,N} (\alpha_{\hat{i}} - \alpha_{\hat{i}}^{*})(\alpha_{\hat{j}} - \alpha_{\hat{j}}^{*}) K(\hat{x}_{\hat{i}}, \hat{x}_{\hat{j}}) - \varepsilon \sum_{\hat{i}}^{N} (\alpha_{\hat{i}} + \alpha_{\hat{i}}^{*}) + \sum_{\hat{i}}^{N} \hat{y}_{\hat{i}} (\alpha_{\hat{i}} - \alpha_{\hat{i}}^{*}), \tag{6}$$

such that $0 \leq \alpha_{\hat{i}}, \alpha_{\hat{i}}^{*} \leq C$ and $\sum_{\hat{i}}^{N} \hat{y}_{\hat{i}} (\alpha_{\hat{i}} - \alpha_{\hat{i}}^{*}) = 0$.

Once this Lagrangian optimization is solved, that is, the optimal ε-tube hyperplane $f(\hat{x})$ that best fits the training data $\hat{x}$ is found, we can use it to make predictions on new unseen data points.

2) DNN DL Predictive Model: The adopted DNN model is designed to capture the spatio-temporal patterns inherent in our $\hat{x}$ dataset. The input layer of our model consists of six neurons to receive each of the N six-dimensional training and validation samples. We opt for a model of two hidden layers with 64 neurons and a Rectified Linear Unit (ReLU) activation function. The final layer has two neurons, assigned to forecasting the x-coordinate and the y-coordinate of an ambulance's next location, respectively. This architectural arrangement equips the DNN to learn and generalize the training data effectively. The model undergoes 1000 training iterations with a batch size of 32 drawn from the N samples; 20% of the training data serves as a validation subset. These settings, determined by empirical exploration, balance model complexity and generalization. Once the model is tuned and trained, real-time data, including the current vehicle location, is fed into the input layer. The model aims to precisely predict the future coordinates of the vehicle $(x_{V_i}^{T^+}, y_{V_i}^{T^+})$. This prediction is the tool for alleviating the observed delay ∆τ and improving the overall efficiency of DT-HITS, as discussed in Section III.

III. PREDICTION EFFECT AND DT EXHIBITION: ANALYSIS AND DISCUSSION

In this section, our objective is to evaluate the performance of our SVR and DNN prediction models and give a detailed account of their accuracy. We begin with an analysis of model evaluation metrics, shedding light on the precision and reliability of our models. Next, we show how well the models capture underlying patterns in the testing geospatial data by comparing predicted values with the actual testing data. We then exhibit the mock DT implementation, showing the original and predicted locations in real time. Furthermore, we explore how our predictions offset the observed delay ∆τ and compare the observed improvements. We implemented our models and conducted our analysis and simulations in Python and MATLAB R2022b environments on a computer with a 2.8 GHz Core i7 processor and 16 GB memory.

A. Predictive Models Validation Accuracy
Measuring the accuracy and error of our two prediction models is crucial to assessing their effectiveness. In our analysis, we use the following three metrics for the assessment.
• Mean Absolute Error (MAE) calculates the average absolute differences between actual and predicted next location coordinates as:

$$MAE = \frac{1}{\hat{N}} \sum_{\hat{i}=1}^{\hat{N}} \left( \left| (x_{V_i}^{T^+})_{\hat{i}} - (\hat{x}_{V_i}^{T^+})_{\hat{i}} \right| + \left| (y_{V_i}^{T^+})_{\hat{i}} - (\hat{y}_{V_i}^{T^+})_{\hat{i}} \right| \right) \tag{7}$$

• Mean Squared Error (MSE) squares the differences, placing more weight on larger errors:

$$MSE = \frac{1}{\hat{N}} \sum_{\hat{i}=1}^{\hat{N}} \left( \left( (x_{V_i}^{T^+})_{\hat{i}} - (\hat{x}_{V_i}^{T^+})_{\hat{i}} \right)^2 + \left( (y_{V_i}^{T^+})_{\hat{i}} - (\hat{y}_{V_i}^{T^+})_{\hat{i}} \right)^2 \right) \tag{8}$$

• R-squared ($R^2$) measures how well our model's predictions explain the variability in the actual data. A value of 1 indicates a perfect fit, while a value of 0 indicates that the model does not explain any variability:

$$R^2 = 1 - \frac{\sum_{\hat{i}=1}^{\hat{N}} \left( \left( (x_{V_i}^{T^+})_{\hat{i}} - (\hat{x}_{V_i}^{T^+})_{\hat{i}} \right)^2 + \left( (y_{V_i}^{T^+})_{\hat{i}} - (\hat{y}_{V_i}^{T^+})_{\hat{i}} \right)^2 \right)}{\sum_{\hat{i}=1}^{\hat{N}} \left( \left( (x_{V_i}^{T^+})_{\hat{i}} - \bar{x}_{V_i}^{T^+} \right)^2 + \left( (y_{V_i}^{T^+})_{\hat{i}} - \bar{y}_{V_i}^{T^+} \right)^2 \right)} \tag{9}$$

where $\bar{x}_{V_i}^{T^+}$ and $\bar{y}_{V_i}^{T^+}$ denote the means of the actual next-location coordinates.

TABLE I: Accuracy Performance Metrics on Validation Dataset
Model Type    Environment    MAE       MSE          R^2
SVM RBF       MATLAB         83.424    15527.804    0.99911
DNN           MATLAB         9.179     17261.584    0.99901
SVM RBF       Python         0.0712    0.0265       0.97345
DNN           Python         0.0105    0.0002       0.99995

In the context of the evaluation metrics described, Table I summarizes the accuracy of our SVR and DNN models in two computational environments, MATLAB and Python. In MATLAB, the SVR RBF model exhibits a moderately high MAE of 83.424 and an MSE of 15527.804, suggesting a moderate level of predictive accuracy. However, its $R^2$ is high at 0.99911, indicating a strong correlation between predicted and actual values. In contrast, the DNN model in MATLAB achieves a significantly lower MAE of 9.179 but a higher MSE of 17261.584. Despite its slightly lower $R^2$ of 0.99901, the lower MAE points to the better predictive accuracy of the DNN model and its ability to capture intricate patterns within the validation dataset. Both the SVR RBF and DNN models report much lower errors in the Python environment. The SVR RBF achieves an MAE of 0.0712 and an MSE of 0.0265, with a high $R^2$ of 0.97345, indicating a close alignment with the validation data. The DNN model in Python does better still, achieving an $R^2$ of 0.99995 with negligible MAE (0.0105) and MSE (0.0002), which highlights its ability to capture the underlying patterns within the validation dataset. Across both environments, the DNN model outperforms the SVR RBF in MAE (and, in Python, on all three metrics), with the Python implementations yielding lower errors. The consistently high $R^2$ values across all models support their suitability for predictive tasks in MATLAB and Python environments.

B. Prediction Testing Scenarios Results
In evaluating the generalization of the models to unseen data, two datasets denoted $\hat{x}$ are used, with 40,128 samples (Scenario 1) and 19,252 samples (Scenario 2), each with six-dimensional input features. Testing is carried out in MATLAB and Python environments to ensure cross-platform consistency. The MATLAB results in Fig. 4 reveal that, for the larger dataset of Scenario 1, the predicted latitude and longitude responses are closely aligned with the true responses, though there are occasional outliers. However, the test results for the smaller dataset in Scenario 2, Fig. 5, indicate a higher error. This discrepancy suggests potential challenges in the models' ability to generalize effectively to datasets with fewer samples, offering insight into their performance across varying data sizes.

Fig. 4: DNN and SVR Models Scenario 1's True vs Predicted Response in MATLAB: (a) DNN: Latitude, (b) DNN: Longitude, (c) SVR: Latitude, (d) SVR: Longitude
Fig. 5: DNN and SVR Models Scenario 2's True vs Predicted Response in MATLAB: (a) DNN: Latitude, (b) DNN: Longitude, (c) SVR: Latitude, (d) SVR: Longitude

The Python results show a closer match between the predicted and the actual locations of the emergency vehicle for both test datasets, as shown in Fig. 6. These visualizations suggest a higher accuracy in the Python environment, with a closer alignment between the predicted and true responses for the ambulance's next location. The implication is that, in contrast to the MATLAB results discussed earlier, the Python implementation of the models performs better in predicting the ambulance's location, highlighting the influence of the programming environment on the models' effectiveness.

C. Showcasing The Prediction in The DT
A mock DT environment has been developed on the host machine² to emulate the original and predicted ambulance locations. Using the open-source Docker platform for containerization, the DT application and its associated components and dependencies are encapsulated within virtual containers; a Docker container is constructed using a Docker package/image [13]. The orchestration of these containers is managed by Docker-compose, which is crucial for the cohesive operation of the DT system. The resultant integrated DT system is contained within the Docker-compose framework, as illustrated in Fig. 7. Three main images are used. For data streaming, Apache Kafka, known for its scalability and fault tolerance, is set up using the "confluentinc/cp-kafka:latest" image [14]. Apache Zookeeper is used through the "confluentinc/cp-zookeeper:latest" image to coordinate and synchronize the Kafka streaming nodes (brokers) and ensure consistent management of system coordination [16]. Lastly, the Grafana visualization, represented by the "grafana/grafana:latest" image, builds a connection agent to integrate Kafka with the visualization, i.e., to send Kafka metrics to the Grafana Cloud instance.

² We build the visualization environment on a virtual machine that runs the Ubuntu 22.04 operating system and has 4 GB memory.

The data pipeline in our DT starts with reading the input data. We used a CSV file with the timestamp, the latitude and longitude coordinates of the current location $(x_{V_i}^{T}, y_{V_i}^{T})$, and the coordinates of the next corresponding location $(x_{V_i}^{T^+}, y_{V_i}^{T^+})$ predicted using SVR and DNN, for an ambulance of the 221 vehicles in

the original data set. A Python script named "sendStream.py" parses the CSV file. This script initializes the Kafka Producer, which is constructed using the "confluent-kafka" library, a toolset that exposes Kafka functionality within the Python environment. The Kafka Producer sends the read data to the Kafka topic, which serves as a fundamental organizational entity within the Kafka cluster, enabling Producers to dispatch data and Kafka Consumers to retrieve data from this communication medium. In this setup, the Kafka Producer processes each row of the CSV file, translating it into a JSON-formatted message before sending it to the 'my-stream' topic in Kafka, as delineated in Algorithm 1. The script calculates the temporal interval between successive timestamps to simulate a live data stream. Consequently, it introduces a delay in transmitting messages, ensuring that the temporal fidelity of the data is maintained. Conversely, the consumer-side Python script, "processStream.py", is responsible for instantiating the 'my-stream' topic and the Kafka Consumer. This consumer is configured to subscribe to the 'my-stream' topic and outputs the incoming messages to the terminal. Integration with Grafana is achieved through the use of an Apache Kafka plugin [17], which registers the Kafka Broker as a data source within Grafana, thereby granting it access to the data streamed to the 'my-stream' topic, as explained in Algorithm 2. Our Grafana dashboard, Fig. 8, is configured to create five different visualization panels. These panels display longitude-latitude charts representing the testing data locations, the predictions from SVR and DNN, a geomap that plots the data, and a temporal visualization of the timestamps.

Algorithm 1 Send data to Kafka topic [18]
function SEND_TO_KAFKA_TOPIC(input_data)
    Initialize KafkaProducer
    for each data_point in input_data do
        KafkaProducer.send('my-stream-topic', value=data_point)
    end for
    Close KafkaProducer
end function

Algorithm 2 Consume data from Kafka topic [18]
while True do
    function CONSUME_FROM_KAFKA_TOPIC
        Initialize KafkaConsumer for 'my-stream-topic'
        for each message in KafkaConsumer do
            data_point = message.value
            Process data_point
            Store or forward processed data for visualization
        end for
    end function
end while

D. Witnessed Delay Offset
To assess the impact of prediction on mitigating the observed delay (∆τ) between the digital and physical domains, our simulation-based analysis initially examines the communication delay observed from the perspective of the Road Side Unit (RSU) in the physical network. With an RSU coverage of 1 km that accommodates around n = 40 vehicles, the RSU uses V2I communication through a cellular network with a data transfer rate of 100 Mbps. Furthermore, V2V communication employs WiFi at 6 Mbps, featuring a control channel duration of 46 msec. Specifically, for a safety application that transmits 310 bytes of data, a vehicle beacon rate of 100 msec, and an application processing time of 2.23 msec, Table II illustrates a noticeable correlation between the increase in the number of vehicles n and the increase in delay in the physical HITS realm. Upon deploying virtual twin prediction models for this HITS at the RSU's edge server, improvements in the observed temporal gap (∆τ) between the DT and the physical system are evident, as depicted in Table II; the DT figures include the average testing prediction delays of 0.0883 sec for the DNN model and 0.0037 sec for the SVR model. The results in Table II show the efficacy of using predictive models to bridge the temporal gap in HITS, especially as the number of vehicles increases. The DNN and SVR models contribute to substantial ∆τ reductions, reflecting improved synchronization between the digital and physical realms.

TABLE II: Witnessed Communication Delay ∆τ (sec) when no DT is used and when DT with prediction is used
n     No DT          DT and Prediction    Improvement (%)
2     1.657793333    0.196686667          88.13
5     4.144483333    0.347716667          91.61
10    8.288966667    0.599433333          92.76
15    12.43345       0.85115              93.15
20    16.57793333    1.102866667          93.34
25    20.72241667    1.354583333          93.46
30    24.8669        1.6063               93.54
35    29.01138333    1.858016667          93.59
40    33.15586667    2.109733333          93.63
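The Improvement column of Table II can be checked with a few lines of Python. The sketch below assumes that the improvement is the relative reduction of the delay, (No DT − DT) / No DT × 100; the paper does not state this formula, but it reproduces every published value to within 0.01 percentage points (the published figures appear to be truncated rather than rounded). The names `TABLE_II`, `improvement_percent`, and `summarize` are illustrative and do not come from the paper.

```python
# Table II values copied from the paper:
# n -> (delay without DT, delay with DT and prediction), both in seconds.
TABLE_II = {
    2: (1.657793333, 0.196686667),
    5: (4.144483333, 0.347716667),
    10: (8.288966667, 0.599433333),
    15: (12.43345, 0.85115),
    20: (16.57793333, 1.102866667),
    25: (20.72241667, 1.354583333),
    30: (24.8669, 1.6063),
    35: (29.01138333, 1.858016667),
    40: (33.15586667, 2.109733333),
}


def improvement_percent(no_dt: float, with_dt: float) -> float:
    """Relative reduction of the witnessed delay, in percent (assumed formula)."""
    return (no_dt - with_dt) / no_dt * 100.0


def summarize(table: dict) -> dict:
    """Map each vehicle count n to its computed improvement percentage."""
    return {n: improvement_percent(no_dt, with_dt)
            for n, (no_dt, with_dt) in table.items()}


if __name__ == "__main__":
    for n, pct in summarize(TABLE_II).items():
        no_dt, _ = TABLE_II[n]
        # no_dt / n shows how the no-DT delay scales with the number of vehicles.
        print(f"n={n:>2}: improvement={pct:.2f}%  no-DT delay per vehicle={no_dt / n:.4f} s")
```

Running the script also prints the no-DT delay divided by n, which comes out at roughly 0.829 s for every row, so the tabulated no-DT delay is proportional to the number of vehicles.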

Fig. 6: SVR and DNN Models' True vs. Predicted Next Locations and Reported Errors in Python: (a) Scenario 1, (b) Scenario 2
Fig. 7: Mock DT for Data Pipeline Architecture

IV. CONCLUSION AND KEY EXTENSIONS

Despite the prevalent claim of real-time representation, the digital counterpart of HITS is always catching up with its physical world due to synchronization delays ∆τ. Our approach leveraged AI models to forecast the ambulance's next location in the virtual world and align the virtual positions with their physical twin locations. We built a mock DT environment using Docker and Kafka to support a real-time data pipeline of actual and predicted locations; our DT enabled accurate visualization through Grafana, enhancing synchronization in HITS operations. In this DT, both the SVR and DNN models showed high prediction accuracy in various testing scenarios and visually underscored the effectiveness of our methodology. In particular, DNN was the superior model, outperforming SVR in multiple instances. Both models substantially reduced the observed delays, from 1.65 sec in the case of n = 2 vehicles to only 0.196 sec and from 33 sec in the case of n = 40 vehicles to only 2.1 sec. These results highlight the efficacy of our proposed approach and the role of AI predictive models in addressing such critical challenges. Future work could focus on ensemble modeling, combining SVR and DNN for improved predictions, and on exploring hybrid solutions such as edge-cloud DT integration for enhanced real-time processing efficiency.

ACKNOWLEDGMENT
This work was supported by the USDOT UTC CARMEN+ Project and the Turkish Scientific and Technological Research Council (TUBITAK) 1515 Frontier R&D Laboratories Support Program for BTS Advanced AI Hub: BTS Autonomous Networks and Data Innovation Lab Project 5239903.
Fig. 8: Developed HITS DT: Original vs. Predicted Locations (for a recorded visualization excerpt see [18])

REFERENCES

[1] Tom H Luan, Ruhan Liu, Longxiang Gao, Rui Li, and Haibo Zhou. The paradigm of digital twin communications. arXiv preprint arXiv:2105.07182, 2021.
[2] Sarah Al-Shareeda, Sema F Oktug, Yusuf Yaslan, Gokhan Yurdakul, and Berk Canberk. Does twinning vehicular networks enhance their performance in dense areas? arXiv preprint arXiv:2402.10701, 2024.
[3] Sarah Al-Shareeda, Fusun Ozguner, and Berk Canberk. Group-signature authentication to secure task offloading in vehicular edge twin networks. In GLOBECOM 2024 Workshop - BlockSecSDN, page ?, 2025.
[4] E Muhammad Saim, Sarah Al-Shareeda, Keith Redmill, and Umit Ozguner. Safety in connected automated vehicles in the presence of vulnerable road users. 2024.
[5] E Muhammad Saim, Sarah Al-Shareeda, Keith Redmill, Umit Ozguner, et al. Control of automated vehicles in vehicle-pedestrian environment. 2024.
[6] Xiaoqing Yang, Jinkai Zheng, Tom H Luan, Rui Li, Zhou Su, and Mianxiong Dong. Data synchronization for vehicular digital twin network. In GLOBECOM 2022-2022 IEEE Global Communications Conference, pages 5795–5800. IEEE, 2022.
[7] Jinkai Zheng, Tom H Luan, Yao Zhang, Rui Li, Yilong Hui, Longxiang Gao, and Mianxiong Dong. Data synchronization in vehicular digital twin network: A game theoretic approach. IEEE Transactions on Wireless Communications, 2023.
[8] Meng Chen, Qingjie Liu, Weiming Huang, Teng Zhang, Yixuan Zuo, and Xiaohui Yu. Origin-aware location prediction based on historical vehicle trajectories. ACM Transactions on Intelligent Systems and Technology (TIST), 13(1):1–18, 2021.
[9] Nishanthi Dasanayaka and Yanming Feng. Analysis of vehicle location prediction errors for safety applications in cooperative-intelligent transportation systems. IEEE Transactions on Intelligent Transportation Systems, 23(9):15512–15521, 2022.
[10] Farimasadat Miri, Alireza A Namanloo, Allan M De Souza, and Richard W Pazzi. A novel short-term vehicle location prediction using temporal graph neural networks. In 2022 IEEE Latin-American Conference on Communications (LATINCOM), pages 1–6. IEEE, 2022.
[11] Dawen Zheng, Lusheng Wang, Caihong Kai, and Min Peng. Resource optimization for task offloading with real-time location prediction in pedestrian-vehicle interaction scenarios. IEEE Transactions on Wireless Communications, 2023.
[12] Muthucumaru Maheswaran, Tianzi Yang, and Salman Memon. A fog computing framework for autonomous driving assist: Architecture, experiments, and challenges, 2019.
[13] Docker Inc. https://www.docker.com/. Accessed: 03-20-2024.
[14] Apache Software Foundation. https://kafka.apache.org/. Accessed: 03-20-2024.
[15] Raintank Inc. https://grafana.com/. Accessed: 03-20-2024.
[16] Apache Software Foundation. https://zookeeper.apache.org/. Accessed: 03-20-2024.
[17] Grafana Labs. https://grafana.com/grafana/plugins/hamedkarbasi93-kafka-datasource/. Accessed: 03-20-2024.
[18] Yasar Mehmet Celik. Mock digital twin visualization with grafana, 2023. Accessed: 2024-04-01.
