2022 A Credibility Assessment Approach For Scenario-Based Virtual Testing of Automated Driving Functions
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/OJITS.2022.3140493, IEEE Open
Journal of Intelligent Transportation Systems
OJ-ITS-2021-10-0081.R1 1
Index Terms—automated driving; software-in-the-loop; scenario-based approach; virtual testing; virtual development; virtual
validation; computer simulation; automotive engineering; intelligent vehicles; automated vehicles
I. INTRODUCTION
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
and can at least partly substitute testing in the real world [14]. For this reason the key question is how to prove that simulated scenarios can be used for the validation and assessment of real automated driving vehicles. The ADF is designed and developed to function in the real world, but the testing is intended to be done mostly virtually in simulation. If the same ADF behaves differently in the simulation than in the real world, the simulation results have no significance. Hence, it is necessary to show that an ADF behaves in a similar way to its counterpart in the real world when the same software versions are used and confronted with the same conditions in its surroundings. In this way, it is possible to reproduce the results of a physically tested scenario and to show that errors and failures can be avoided.

In this work the authors propose a new method for assessing the credibility of virtual scenario-based testing of ADFs. The idea is to perform a real test drive with an ADF and to log the bus data of the vehicle. Afterwards, the scenarios are identified and extracted from the real driving data and resimulated, including the static environment and all traffic participants, with the ADF connected to the simulation. Finally, the behavior of the driving function in the virtual world is analyzed with respect to its behavior in the real world. Depending on whether the function can reproduce its real behavior, the credibility of the virtual testing can be deduced. The proposed method is a necessary preliminary work to show how representative the results of the virtual scenario-based testing are.

The contributions of this work are:
• A method for assessing the credibility of an ADF simulation.
• A set of adapted metrics to quantify and qualify a similar behavior between simulation and real drive.
• A normalized relative credibility assessment for the comparison of different simulation setups.

This article is organized as follows: Section II and Section III present background information and a literature review. Section IV introduces the new approach of this work, subdivided into five sections: real test drive, scenario extraction, simulation environment, simulation execution and credibility assessment. Section V presents the experimental setup with its case studies and results. The discussion and interpretation of the results follows in Section VI, including its limitations. Section VII concludes and presents an outlook on future topics.

II. BACKGROUND

In this section relevant background information on the method presented in Section IV is given, beginning with an in-depth understanding of a scenario as used within the scenario-based simulation approach and the standards applied in the case study. Furthermore, details about credibility in simulation, scenario extraction as well as the necessity of determinism are provided.

A. Definition of Scenario

According to Ulbrich et al. [15] a scenario is a temporal

is described. It does not only contain the consecutive frames, but also the actions, events and goals executed by all the actors and the events that are happening. In [16] the authors characterize a scenario by five layers in order to simplify the structure and the categories of information in a scenario. The static environment is defined by three layers. The first layer describes the road geometry and topology, like street dimensions with the number of driving lanes or the radius of a curve. The second layer adds traffic infrastructure, like traffic signs or traffic lights, and finally the third static layer adds temporary manipulations of the first two layers, like construction sites. The dynamic environment is defined in the fourth layer, where all objects, e.g. vehicles and pedestrians, are included with their respective interactions and maneuvers. The fifth layer describes all environmental conditions, like weather or daytime, with their effects on the first four layers.

B. ASAM Standards

In recent years, the Association for Standardization of Automation and Measuring Systems (ASAM) has set many standards in the field of automotive simulation [17]. As ASAM standards have a broad basis in research and industry, these standards are solely used in this work in order to ensure the applicability of the method to the broad research community.

The ASAM OpenDRIVE (ODR)¹ simulation standard is used for the description of static elements of a road network. The standard specifies the geometry of the road as well as infrastructural elements that influence its logic, such as road marks, traffic signs and traffic lights.

ASAM OpenSCENARIO (OSC)² serves as a simulation standard for describing the dynamic entities in a traffic simulation. In an OSC file the actions and states of each individual vehicle, pedestrian, further road user and entity can be defined. OSC allows a scenario to be described in two ways: (1) based on driving actions (individual maneuvers), which consist of actions, events, goals and values, or (2) based on trajectories, which consist of coordinates and timestamps.

C. Definition of Credibility in Simulation

There are many scales that try to measure the quality in modeling and simulation. However, the main question is always whether the simulation is credible and can therefore be used in the verification and validation process to decide whether a real test can be replaced by a virtual one or not.

Credibility in modeling and simulation results is defined as “the quality to elicit belief or trust” according to the NASA standard 7009a [18]. Liu et al. [19] add that credibility always has to be defined for a predetermined purpose. This also includes the data used and the results obtained from simulation models. Credibility is not directly linked to the level of quality in modeling and simulation as common verification and validation processes are, but it is strongly related to it [20]. However, metrics are needed to quantitatively evaluate the credibility assessment, and these define a threshold for the different levels.

¹ https://www.asam.net/standards/detail/opendrive/
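The five-layer scenario structure from [16] described in Section II-A maps naturally onto a small data structure. The following is a minimal Python sketch; the class name `Scenario` and all field names are illustrative assumptions and are not taken from [16] or from the ASAM standards.

```python
from dataclasses import dataclass, field


@dataclass
class Scenario:
    """Five-layer scenario container (field names are hypothetical)."""

    # Layer 1: road geometry and topology, e.g. number of lanes, curve radius
    road_network: dict = field(default_factory=dict)
    # Layer 2: traffic infrastructure, e.g. traffic signs, traffic lights
    traffic_infrastructure: list = field(default_factory=list)
    # Layer 3: temporary manipulations of layers 1 and 2, e.g. construction sites
    temporary_changes: list = field(default_factory=list)
    # Layer 4: dynamic objects with their interactions and maneuvers
    dynamic_objects: list = field(default_factory=list)
    # Layer 5: environmental conditions, e.g. weather, daytime
    environment: dict = field(default_factory=dict)

    def static_layers(self):
        """Return the three layers that make up the static environment."""
        return (self.road_network, self.traffic_infrastructure, self.temporary_changes)
```

In such a layout, layers 1 to 3 would typically be filled from an ODR description and layer 4 from an OSC description, although this mapping is a simplification.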
D. Scenario Generation

Scenarios can be generated using different approaches. A widespread practice in the automotive industry is the generation of scenarios by experts based on their existing domain knowledge [21]. However, these scenarios are generated only from existing, subjective and restricted knowledge. Thus it is challenging to prove the validity and completeness of scenarios for validation of the whole system. Therefore, it is necessary to rely on scenarios that can occur and have occurred on public roads when taking into consideration the credibility of the scenarios used for validation.

Automotive companies drive a huge number of testing kilometers every day with vehicles at different stages of development [22]. Such testing vehicles record the real driving data while performing their endurance and performance runs. This results in thousands of hours of existing real driving data that can be exploited and used to understand what is happening in real traffic by identifying and extracting traffic knowledge, driving scenarios and parameter distributions. The resulting scenarios from real test drives have in fact occurred on public roads and are therefore valid and traceable test cases for the validation of ADFs.

However, the data is unlabeled, i.e. it is unknown what kind of scenarios happened during the test drive and are associated with the raw data. A scenario is not directly measurable. Nonetheless, it can be extracted by an interpretation of multiple signal patterns and their context. There exist different approaches for the identification of scenarios [23], [24], which can be subdivided into different categories: the rule-based approach starting from a scenario catalog, the supervised learning approach based on ground truth reference data and the unsupervised approach based on pattern recognition in the data. In this work a simple rule-based algorithm is presented and used in the method.

E. Software-in-the-Loop

The development and testing of ADFs in virtual environments (VEs) is carried out in so-called software-in-the-loop (SIL) simulation setups. It requires the coupling of several simulation tools and models with the ADF, which itself also consists of several software components. Fig. 2 illustrates the SIL configuration for automated driving with typical components. In general, the core of the simulation is formed by a submicroscopic traffic simulator [25], which often contains more than just pure traffic simulation (e.g. vehicle dynamics, behavior, and sensor models). The submicroscopic traffic simulator is at times not sufficient for a reasonably realistic representation of the virtual environment, and the coupling of further simulation models might be required [26], [27]. The models provide features that the standalone simulator does not yet include (e.g. network simulation) or they are custom-designed models that fulfill their function particularly well (e.g. models of specific sensors).

The operation of the ADF can be described by means of the sense-model-plan-act (SMPA) [28] methodology. The world is perceived through sensors or, in this case, sensor models. In the model step an internal accumulated world model with all known dynamic and static objects and their properties is built. The planning step is responsible for the decision-making and trajectory generation. Finally, in the actuation step set values are passed on to actuators or, as in this case, to a vehicle dynamics model. The various steps typically consist of several compact software modules, which are connected to each other via diverging and merging data flows.

The distributed nature of the SIL poses some threats with respect to determinism, which might affect the repeatability of the experiments that are performed. The repeatability property is particularly important, because it forms the foundation for regression testing and test automation. The high computational effort and the sheer number of interconnected software modules require the ADF to be executed on multiple machines and parts of it even on heterogeneous platforms, such as graphics cards. Not only must it be ensured that all software modules meet their deadlines, but attention must also be paid to data determinism. The same applies to the virtual environment. Multiple models (e.g. submicroscopic traffic simulation, sensor models, vehicle dynamics models) need to be coupled and, as the simulation is computationally expensive, it might need to be run on multiple simulation machines. The simulation models and tools must be synchronized with respect to the simulation time. Hazards regarding non-deterministic message loss, message order and message timing need to be prevented. In [29], [30] these phenomena are explained in more detail and synchronization mechanisms are presented, which are also used in this work. Consequently, the simulation runs conducted in this work produce repeatable results.

III. RELATED WORK

In recent years different approaches for the validation and verification of ADFs have been published.

In the research project PEGASUS [31] 17 partners from scientific institutions and industry propose a scenario-based verification and validation approach for ADFs. Their central elements are based on the definition of requirements for the ADF, data processing methods, a joint and public database, the assessment of the ADF and finally the safety argumentation. This project can be seen as a first open step from a distance-based validation approach to a systematic scenario-based one. A focus of the project is the testing of scenarios on a real test track with the ability to repeat those scenarios by controlling the vehicles remotely. In this way it is possible to improve or update the ADF and analyze its behavior when confronted with the same circumstances. One realization of the project was that simulation will play a big role in the verification and validation process. Therefore, two further research projects, SET Level³ and VVM⁴, arose with a stronger focus on the development of testing methods and simulation-based testing.

³ https://setlevel.de/projekt
⁴ https://www.vvm-projekt.de/en/
⁵ https://save-in.digital/
⁶ https://www.bmvi.de/SharedDocs/DE/Artikel/DG/AVF-projekte/savenow.html

In the research project SAVe⁵ and its successor SAVeNoW⁶ partners from the industry and scientific institutions develop a
methodology for a holistic simulation model of a city and its traffic. The simulation model is used for the optimization of the traffic flow and the virtual testing of mobility services and ADFs. The vision is to ensure a secure release of automated vehicles with the help of simulation by having a digital twin of a real digital test field in order to compare real test drive results with those of its virtual twin.

[Fig. 2: SIL configuration for automated driving with typical components. Simulation Software: Submicroscopic Traffic Simulation (e.g. road network, traffic control, weather control), Behavior Models (e.g. bicycles, pedestrians, drivers), Vehicle Dynamics, Network Simulation, Sensor Models (e.g. camera, radar). Automated Driving Function: Sense, Model, Plan, Act, fed by a noisy and delayed snapshot of the world.]

In [32] the authors propose different aspects to consider in order to improve validation effectiveness compared to a brute-force approach: they mention that behavioral requirements must be identified before testing the correctness in order to be able to provide pass and fail criteria. Even if the testing on the vehicle level finds no failures at all, that does not mean the ADF is necessarily safe. They also state potential solutions in order to improve the safety without the need to validate any hypothetical scenario, for instance by providing the external guarantee that the vehicle will not encounter a scenario it cannot handle. Alternatively, the ADF may contain the capability to detect that it is in a situation outside of its operational domain and bring the vehicle into a safe state.

As far as official regulations are concerned, requirements for the approval of ADFs are not completely defined yet. However, as a first step towards an official document in this field, there exists a proposal of the United Nations Economic Commission for Europe (UNECE) for the approval of vehicles with an automated lane keeping system (ALKS) [33]. In this document the commission presents the requirements the manufacturer must demonstrate to the technical service during the inspection for approval. It includes all the test specifications and their pass criteria, parameters and parameter ranges, operator information, data recordings during the drive, cyber security aspects and more. Simulation for the verification of the safety concept is also mentioned, in particular for scenarios that are difficult to test on a test track or on public roads. However, the manufacturers have to demonstrate the scope of the specific virtually tested scenarios as well as the validity of the simulation tool chain itself.

Further framework conditions are provided by ISO 26262 [34], which defines a process model together with required activities and work products as well as methods to be applied in development and production. Part 9 of ISO 26262 adds special safety-oriented methods such as Automotive Safety Integrity Level (ASIL) decomposition, criteria for coexistence of elements of different ASIL classifications in a system, and requirements for safety analysis.

In addition to specific processes for the validation and verification of ADFs, there are also approaches focused on simulation-based validation in general.

Sargent [35], [36] describes the challenge of validation and verification of simulation models. There are different ways to evaluate a model and different validation approaches and techniques. However, the author states that for each single problem that should be solved with the simulation, a separate and appropriate model is needed. Therefore, there is no fixed set of methods or practices for the verification of simulation models, because every model poses its own unique challenge. Some of the presented techniques, e.g. animation, comparison to other models, event validity or parameter variability-sensitivity analysis, are also considered in this work.

In [37] the authors propose an approach for appropriate scenario selection and model validation for the virtual testing of ADFs. First they present a coverage-based and data-driven approach for scenario generation for their simulation pipeline. Secondly, they show how to evaluate the behavior of the simulation model with real data with the help of statistical validation metrics and regression techniques. They state that there are deviations between simulation and reality. In addition, they just use simulation as a tool, i.e. they do not evaluate or attempt to prove and improve the validity of the simulation or reduce modeling errors.

The work in [38] is based on a collected catalog of real high-severity collision scenarios. The authors reproduce those scenarios in the simulation and perform a so-called "what-if" analysis by replacing one of the human-driven crash participants with an ADF. The ADF can prevent a collision in some cases or at least mitigate the severity of the crash, so that an improvement of traffic safety can be shown.

None of the published approaches related to automated driving covers the whole simulation chain and compares the impact of the ADF in the real world with its impact in the virtual world. They are either methods on how to improve and test the ADF in the simulation or are focused on the evaluation of subsegments of the whole simulation setup. The research
question in this work addresses the credibility question of whether real driven scenarios can be replaced by virtually driven scenarios. For this reason, the whole simulation chain is taken into account with the focus described at the end of Section I.

IV. METHODOLOGY

When testing ADFs in simulation two important questions arise: How valid or beneficial are the tests in the virtual environment? And how can their validity or benefit be proved? This work focuses on a methodology for a valid virtual testing procedure for ADFs and indicates how to quantify and strengthen the credibility of the simulation connected to an ADF. In the following sections the single steps are presented and their importance and necessity are explained in detail. Fig. 3 summarizes the conducted steps.

A. Real Test Drive

The initial step of the proposed method is to perform a real test drive with an automated vehicle. During the drive the ADF is activated without any interruptions and is supervised by a safety driver and an additional monitoring system. The target parameters of the function, e.g. target velocity, driving style or destination, are set. Defined data is logged during the entire trip. These measures ensure that all the data required for subsequent reproduction in the simulation is available. In addition to the target parameters of the function, the test vehicle needs to log the bus communication data in order to gather all the information from the control units and environmental sensors, e.g. cameras, radars and lidars. In this way it is possible to comprehend and replicate the behavior of the test vehicle itself and of the surrounding static and dynamic entities.

Table I lists the minimum required signal set that must be incorporated in the logging system. Timestamp and ego vehicle signals, i.e. those from the global positioning system (GPS) and the electronic stability control (ESC), are needed for positioning the ego vehicle and understanding its driving behavior. The infrastructure signals result from the camera and are required for the detection of the lines of the driving lane. Later they are used for the identification of lane-change and cut-in maneuvers. The entity signals, which result from typical environment sensors, are used to calculate the position of the surrounding vehicles relative to the ego vehicle and their respective speed and class.

TABLE I: Signal logging requirements during the real test drive.

Category        Sensor              Signal                 Units
General         -                   timestamp              [ms]
Ego vehicle     GPS                 latitude               [deg]
                                    longitude              [deg]
                                    heading                [deg]
                ESC                 speed                  [m/s]
Infrastructure  Camera              X lane distance left   [m]
                                    Y lane distance left   [m]
                                    X lane distance right  [m]
                                    Y lane distance right  [m]
Entities        Camera/Radar/Lidar  X distance             [m]
                                    Y distance             [m]
                                    X speed                [m/s]
                                    Y speed                [m/s]
                                    class                  -
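The signal set of Table I can be checked automatically before a logged frame is passed on to the scenario extraction. The sketch below is only an illustration: the dictionary layout, the key names and the function `missing_signals` are assumptions, since the article does not specify a logging format.

```python
# Required signals per category, following Table I (key names are hypothetical).
REQUIRED_SIGNALS = {
    "general": ["timestamp_ms"],
    "ego": ["latitude_deg", "longitude_deg", "heading_deg", "speed_mps"],
    "infrastructure": [
        "lane_left_x_m", "lane_left_y_m",
        "lane_right_x_m", "lane_right_y_m",
    ],
    "entities": [
        "x_distance_m", "y_distance_m",
        "x_speed_mps", "y_speed_mps", "object_class",
    ],
}


def missing_signals(frame):
    """Return the names of all Table I signals missing from one logged frame.

    frame: dict with the keys "general", "ego" and "infrastructure" (each a
    dict of signals) and "entities" (a list with one dict per detected object).
    """
    missing = []
    for category in ("general", "ego", "infrastructure"):
        logged = frame.get(category, {})
        missing += [f"{category}.{name}"
                    for name in REQUIRED_SIGNALS[category] if name not in logged]
    # Entity signals are required for every detected object individually.
    for idx, entity in enumerate(frame.get("entities", [])):
        missing += [f"entities[{idx}].{name}"
                    for name in REQUIRED_SIGNALS["entities"] if name not in entity]
    return missing
```

An empty return value means that the frame carries the minimum signal set; a frame without any detected objects passes the entity check trivially.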
Fig. 3: Graph of the proposed method which shows the single steps with its components. [Step labels recoverable from the figure: Real Test Drive, Scenario Extraction, Scenario Format (OSC), Simulation Execution; further labels: Automated Driving Function, Driver Model, Maneuver.]
B. Scenario Extraction

The goal of this step is to take the recorded data and extract scenarios for the resimulation. As described in Section II-D the identification of scenarios from real test drive data can be performed in different ways. The aim of this work is not to establish a novel way of identifying scenarios or to create a comprehensive scenario catalog, but rather to find sample scenarios for the validation. For the presented purpose, it is sufficient to use a rule-based method in order to find a variety of scenarios. Two different state machines help to identify lane-change and car-following scenarios on the highway. Fig. 4 shows the two state machines used in this work. In Fig. 4a, once a vehicle starts approaching a line the data is labeled as a lane-change scenario between S2 and the ending state S4. In Fig. 4b, as soon as the requirements between S1 and S2 are fulfilled and the distance is kept, the labeling starts at S3 and ends with S4.

[Fig. 4a: State machine for lane-change detection. States S1 to S4; transition labels: approach line, cross line, gain distance, stop gaining distance.]
[Fig. 4b: State machine for car-following detection. States S1 to S4; transition labels: not lane-change & object in front, keep distance, change distance.]
Fig. 4: State machines used for identifying sample scenarios.

After identifying scenarios, the signal data has to be converted into a simulation-tool-readable format. Here the authors build on previous work that is described in detail in [39].

At the beginning a preprocessing consisting of three steps is performed. In the first step a unique identification number (ID) is assigned to each traffic participant. Even if an ID is already assigned by the multi-sensor system, the ID can be repeated after a certain amount of time and is therefore ambiguous. Especially for longer scenarios it is thus essential to clarify which detected object is related to which ID in order to avoid vehicle hopping. In the second step, an up- and downscaling of the multi-rate data from the different sensors is performed so that the data is synchronized to a single frequency. Here it is important to find a trade-off between upsampling, i.e. interpolation, of low-frequency signals and downsampling, i.e. information loss, of high-frequency signals. In this case the lowest common denominator of the sensor periods is applied to ensure a valid approach, because the generation of new information during the upsampling process, i.e. working with interpolated and thus not recorded data, can be seen as more critical than loss of information. In the third step the algorithm detects and deletes implausible objects, such as those that are within the sensor’s upper detection limit only for a short period of time, i.e. distant objects that are oscillating between being in and out of a sensor’s detection range.

After having preprocessed the real test drive data, the next step is to convert the information into a simulation format. In this case it is recommended to convert the information into the provided OSC format in order to be able to compare and exchange scenarios between different use cases, simulation tools and development teams.

For the transformation to OSC the scenario is divided into a sequence of lateral and longitudinal maneuvers, e.g. acceleration, deceleration or lane-change. The simulation is maneuver-based, which means that the virtual vehicles are assigned simulation-time-based or position-based actions and not fixed trajectories. The advantage is that the scenario is described in a configurable manner and can be altered in an intuitive and comprehensive way by test engineers. However, while abstracting the real test drive data into a sequence of maneuvers, information is lost due to simplifications, e.g. linear acceleration.

C. Simulation Environment

Within a closed-loop simulation setup, a valid virtual environment model is a prerequisite for testing and validating ADFs, alongside further valid components like sufficient sensor models and a vehicle dynamics model. The quality of the virtual environment model, also referred to as the digital twin, used for the simulation of automated driving is often not clearly specified, and the generation of highly accurate and detailed models is very costly on a large scale. Nevertheless, comparability with the real world has to be ensured. In the context of driving simulation a digital twin of the surrounding environment is a virtual model with physical objects and geometries in the street space. Therefore, as a minimum requirement for driving simulation a virtual environment model of a street space is needed which is comparable to the one in the real world where the scenario actually happened. As a consequence, so-called high-definition (HD) maps are available, which are based on the surveying of traffic networks with cameras and/or laser scanners in combination with precise positioning systems. These raw data are transformed into a machine-readable format like ODR.

From the perspective of automated driving, the real world is a complex system that could have an infinite number of different characteristics and interactions. Additionally, simulation fidelity is not clearly defined within the generated maps. Consequently, rebuilding the digital twin by measuring the reality is almost impossible in terms of accuracy and completeness of appearing contents. For this reason, it has to be checked whether the modeled level of detail of the virtual environment used for a specific application in driving simulation is sufficient to assure validity. This can be achieved by identifying the relevant parameters in the virtual environment of the road network including its near surroundings. The environment model described by the ODR format is represented by a finite number of parameters. Therefore, relevant parameters can be identified e.g. by a sensitivity analysis that varies the original ODR parameters and investigates the influence on the virtual drive within these samples. With
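The lane-change state machine of Fig. 4a in Section IV-B can be sketched as follows. This is one possible reading of the figure, not the authors' implementation: the input signal (a signed lateral distance to the lane line), the thresholds and the abort transition back to S1 are assumptions made for illustration.

```python
def detect_lane_changes(offsets, approach_threshold=0.5, eps=0.01):
    """Label lane changes in a sequence of lateral offsets.

    offsets: signed lateral distance [m] to the lane line, positive before
             the line is crossed and negative afterwards (assumed signal).
    Returns a list of (start_index, end_index) pairs, where the labeling
    starts when S2 is entered and ends when S4 is reached.
    """
    state = "S1"
    start = None
    segments = []
    for k in range(1, len(offsets)):
        prev, cur = offsets[k - 1], offsets[k]
        if state == "S1":
            # approach line: the distance shrinks and is already small
            if 0 < cur < prev and cur < approach_threshold:
                state, start = "S2", k
        elif state == "S2":
            if cur <= 0:
                state = "S3"                  # cross line
            elif cur > prev + eps:
                state, start = "S1", None     # abort (assumption, not in Fig. 4a)
        elif state == "S3":
            # gain distance while the offset keeps growing; otherwise stop gaining
            if abs(cur) - abs(prev) <= eps:
                segments.append((start, k))   # S4 reached, label ends here
                state, start = "S1", None
    return segments
```

For the offsets `[1.0, 0.8, 0.4, 0.2, -0.1, -0.5, -0.9, -0.9, -0.9]` the sketch labels one lane change from index 2 to index 7. A car-following detector in the spirit of Fig. 4b could be structured the same way with distance-keeping conditions instead of line offsets.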
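The third preprocessing step of Section IV-B, removing objects that are only detected briefly, could look like the following sketch. It assumes that unique IDs have already been assigned in the first step and uses the total tracking duration as a simplified plausibility criterion; the function name, the tuple layout and the 1 s default are hypothetical.

```python
def drop_short_lived_objects(detections, min_duration_s=1.0):
    """Remove detections of objects that were tracked only briefly.

    detections: list of (timestamp_s, object_id) tuples with unique IDs.
    Returns the detections of all objects whose first and last sighting
    are at least min_duration_s apart, in the original order.
    """
    first_seen, last_seen = {}, {}
    for t, obj_id in detections:
        first_seen[obj_id] = min(t, first_seen.get(obj_id, t))
        last_seen[obj_id] = max(t, last_seen.get(obj_id, t))
    keep = {obj_id for obj_id in first_seen
            if last_seen[obj_id] - first_seen[obj_id] >= min_duration_s}
    return [(t, obj_id) for t, obj_id in detections if obj_id in keep]
```

A stricter variant would additionally check that the short-lived object was located near the upper detection limit of the sensor, as the text describes; that check is omitted here because it needs sensor-specific range data.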
E. Credibility Assessment

The aim of the approach is to assess the credibility of the simulation results with the artefacts generated in the steps described in Section IV-A to Section IV-D. For this purpose the generated simulation results are compared with the real test drive data, which serves as ground truth. In order to validate whether the ADF behaves equivalently in the real and virtual drive and to ensure the simulation is sufficiently detailed, a step-wise procedure is presented to ensure relative credibility of the simulated scenarios, which is summarized in Fig. 5.

As one prerequisite the quality and accuracy of the test drive data, i.e. the typical noise floor originating from the installed sensors, must be known. Furthermore, the assumptions made for extraction and abstraction of the scenarios have to be

Fig. 6: Interval for validation comparison. [Axis: simulation time, from t0 to tE.]

the end of the initialization period at t0. On the other side, the length of the extracted scenario must be strictly satisfied. Immediately after the end of the last extracted maneuver sequence at time tE the ADF still controls the ego vehicle, but the executed maneuvers have no reference in the test drive data and thus cannot be compared.
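The time-to-collision (TTC) used for the criticality assessment in this section is the distance to a surrounding object divided by the relative velocity, and values of 1.5 s or below are treated as critical [43]. A minimal sketch, assuming a sign convention in which a positive relative velocity means the gap is closing (the article does not state the convention) and using hypothetical function names:

```python
import math

TTC_CRITICAL_S = 1.5  # critical threshold in seconds, as cited from [43]


def time_to_collision(distance_m, relative_speed_mps):
    """Return TTC = d / v_rel in seconds.

    relative_speed_mps > 0 is assumed to mean that the vehicle under test
    approaches the object; otherwise no collision is predicted.
    """
    if relative_speed_mps <= 0.0:
        return math.inf
    return distance_m / relative_speed_mps


def is_critical(distance_m, relative_speed_mps):
    """True if the TTC is at or below the critical threshold."""
    return time_to_collision(distance_m, relative_speed_mps) <= TTC_CRITICAL_S
```

Evaluating `is_critical` for every time step of both the real and the simulated drive would show whether the resimulation reproduces the critical phases of a scenario.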
Before determining the absolute deviations between the real and virtual test drive, the maneuver sequences are juxtaposed for a scenario. If the same maneuvers are driven in simulation and are triggered at approximately the same time, an equivalent behavior of the ADF can be implied.
In order to test whether the resimulation is appropriate in case of criticality, a criticality assessment of the investigated scenario has to be conducted. As a generic metric, the time-to-collision (TTC) can be used to compute the time until a collision occurs, on condition that the vehicles involved maintain their current velocity in magnitude and heading and that they are on a collision course. TTC at a specific time i is defined as
\[ \mathrm{TTC}_i = \frac{d_i}{v_{\mathrm{rel},i}} \tag{1} \]
with the distance d and the relative velocity v_rel between the vehicle under test and a surrounding object. In literature, the TTC is defined as critical for values TTC_crit ≤ 1.5 s [43]. In addition, a multidimensional criticality analysis as presented in [44] can also be applied. A scenario can be critical (a) or not (b) in the real test drive. In case (a), if the simulation also detects critical maneuvers, the methodology can provide valid virtual testing regarding the criticality assessment. The same holds for case (b) when the resimulation is also noncritical. If an opposing criticality of real and virtual behavior is observed, function testing regarding safety cannot be applied and the resimulation is not credible.
In a next step, the correlation of key signals, i.e. position, velocity and acceleration, is computed to identify the signals that negatively affect the deviations calculated by the more detailed metrics in the following steps. Given two datasets x and y as pairs {(x_1, y_1), ..., (x_n, y_n)}, the Pearson correlation r [45] is defined as
\[ r(x, y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}} \tag{2} \]
The correlation describes a linear dependence of the examined signals and is at its maximum (r = 1) when there is a perfectly positive relationship. Conversely, it is at its minimum (r = −1) when there is an anti-correlation, that is, a perfectly negative relationship. r = 0 means that there is no linear relationship between the datasets. This helps to uncover potential uncertainties of the input models as well as purely translational displacements of the driven maneuvers.
In order to compare the credibility of simulation results to real test drive data, quantitative metrics are needed. There is a series of standard metrics for computer simulation validation [46]. The most widely applied metric is the root mean squared error (RMSE) [45]
\[ \mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} \lVert x_i - y_i \rVert^2}{n}}, \tag{3} \]
which computes the average deviation in the form of Euclidean distances. As RMSE ≥ 0 by definition, lower values indicate a better fit between two datasets than higher ones. In the evaluations, the RMSE is not used to directly compare the trajectories driven by the vehicles; instead, a deeper analysis is conducted across the simulation variants to distinguish between the different sources of error.
A large number of signals are available for evaluation in each time step of the simulation. Therefore, a single metric that in the best case represents as much information as possible in one index is desirable in order to achieve a meaningful expression of the simulation quality. Further requirements are the general applicability to all kinds of driving scenarios and the ability to combine static environment information with the vehicle movement, since in a closed-loop simulation this information is directly linked to the ego vehicle's response. These requirements can be fulfilled with the OSPA metric [40], [41]. The OSPA metric consists of multiple components:
\[ \mathrm{OSPA} = \left( d_{\mathrm{lo}}^{p} + d_{\mathrm{ca}}^{p} + d_{\mathrm{la}}^{p} \right)^{\frac{1}{p}}. \tag{4} \]
The first component d_lo^p contains the localization error, i.e. the distance function between two assigned objects. The second component d_ca^p corresponds to the cardinality error, i.e. the unassigned objects compared to ground truth. The labeling error component d_la^p penalizes incorrect assignments. In all components, p refers to the order of the Wasserstein metric used for the calculation. A more detailed computation of the three components of the OSPA metric can be found in literature [40], [41]. Accordingly, the OSPA allows the states of multiple objects to be evaluated over time.
The calculated error values provide a rating of the absolute difference between the real and virtual test drive and give a first estimate of the quality of the simulation results in terms of credibility. Additionally, a purely maneuver-based simulation is conducted with ideal model components as well as with a reference model in order to classify the ADF test drive.
So far, the calculated values for the metrics are based on absolute deviations for specific scenarios. Therefore, a normalization of the metrics presented above is introduced in order to compare independently of the scenario and to allow a scoring against a threshold. For each metric, its normalized value is defined as the probability P(z_t ≤ Z_th), with z_t as the value of the metric at timestamp t and Z_th as the threshold value of the corresponding metric. As thresholds depend on the requirements defined for a specific test case, an overall credibility assessment can only be derived relative to the thresholds. For a final decision that is binding for approval, a link to the requirements which have to be fulfilled has to be established.
V. RESULTS
The test drive for the experiment of this work is carried out in a pre-development prototype vehicle equipped with a data logging system and an ADF. The test drive is recorded on the digital test field on the highway A9 with three lanes in the proximity of the city of Ingolstadt. For this region, a pool of map data with road networks in the ODR format exists. Virtual Test Drive (VTD) [47] is used as the submicroscopic simulation tool. However, the proposed methodology is based on the ASAM simulation standards and is thus compatible with any supporting simulation tool.
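For logged signals such as those of the test drive described above, the TTC, the Pearson correlation, the RMSE and the normalized score P(z_t ≤ Z_th) from Section IV-E could be evaluated along the following lines. This is a minimal sketch in plain Python, not the authors' implementation. The function names, the sign convention for the relative velocity, the rule that a drive is critical as soon as one TTC sample falls below the threshold, and the estimation of the probability as an empirical share of time steps are all assumptions made for illustration. The OSPA metric is omitted because its components are defined in [40], [41].

```python
import math

TTC_CRIT_S = 1.5  # critical TTC threshold in seconds, as cited in the text [43]


def time_to_collision(distance, closing_speed):
    """TTC_i = d_i / v_rel,i.

    Assumption: closing_speed > 0 means the gap is shrinking; otherwise no
    collision is predicted and infinity is returned.
    """
    if closing_speed <= 0.0:
        return math.inf
    return distance / closing_speed


def same_criticality(ttc_real, ttc_sim, ttc_crit=TTC_CRIT_S):
    """True if the real and the simulated drive are both critical or both not.

    Assumption: a drive counts as critical if any TTC sample is <= ttc_crit.
    """
    return (min(ttc_real) <= ttc_crit) == (min(ttc_sim) <= ttc_crit)


def pearson(x, y):
    """Pearson correlation of two equally long scalar signals."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("signals must have equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mean_x) ** 2 for a in x)
                    * sum((b - mean_y) ** 2 for b in y))
    if den == 0.0:
        raise ValueError("correlation is undefined for a constant signal")
    return num / den


def rmse(points_a, points_b):
    """Root mean squared Euclidean distance between paired points (tuples)."""
    if len(points_a) != len(points_b) or not points_a:
        raise ValueError("point lists must be non-empty and equally long")
    squared = (math.dist(a, b) ** 2 for a, b in zip(points_a, points_b))
    return math.sqrt(sum(squared) / len(points_a))


def normalized_score(metric_values, threshold):
    """Empirical estimate of P(z_t <= Z_th): share of time steps within threshold."""
    if not metric_values:
        raise ValueError("metric_values must not be empty")
    return sum(1 for z in metric_values if z <= threshold) / len(metric_values)


# Made-up positions (x, y) in meters for a real and a simulated drive.
real_xy = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
sim_xy = [(0.0, 0.0), (1.0, 0.3), (2.0, 0.4), (3.0, 0.0)]

position_rmse = rmse(real_xy, sim_xy)
# Per-step position error, scored against a hypothetical threshold of 0.35 m.
errors = [math.dist(a, b) for a, b in zip(real_xy, sim_xy)]
score = normalized_score(errors, threshold=0.35)
```

Because the score is a share of time steps, it lies between 0 and 1 regardless of the unit of the underlying metric, which is what makes values from different scenarios comparable against a requirement-specific threshold.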
[Figure legend: Ego vehicle; Dynamic Objects]
• The real test drive of the prototype automated vehicle driving in the real world, referred to as Real Function Drive (RFD).
• The virtual test drive of the virtual automated vehicle driving in the virtual world, referred to as Virtual Function Drive (VFD).
• The virtual maneuver-based OSC drive, in which the ego vehicle retraces the extracted maneuvers, referred to as Virtual Maneuver Drive (VMaD).
• The virtual model-based drive, where the ego vehicle is