
Ansys RTR RL Presentation

The document outlines an experiment to integrate Ansys Real-Time radar with the CARLA driving simulator to enhance vehicle longitudinal control using reinforcement learning. It aims to demonstrate high-fidelity sensor modeling, reduce development costs, and improve training efficiency through virtual simulations. Key components include deep reinforcement learning architectures, simulation tool chains, and the Open Simulation Interface for modular integration.


Training a Vehicle Longitudinal Controller

Using Reinforcement Learning


CARLA Open-Source Driving Simulator
Ansys Real-Time Radar Model

Dr. techn. Kmeid Saad


Senior Principal Application Engineer
Goals of This Experiment
1. Integrate the Ansys Real-Time radar model with the CARLA driving simulator
• Demonstrate high-fidelity sensor modeling at low cost
• Physics-based, with high correspondence to real-world sensors
• Real-time or faster-than-real-time performance
• Easy integration with an open-source simulation platform
• Cross-platform solution using the OSI standard

2. Discover how reinforcement learning coupled with neural networks benefits from high-fidelity sensing
• Train a reinforcement-learning model to control vehicle speed and braking on obstacle approach
• Leverage raw I/Q data and range-Doppler maps to control the training process

3. Demonstrate how to reduce development cost and minimize time to market
• Shift development and testing strategies from the physical to the virtual world
• Challenge perception algorithms with edge-case scenarios and isolate their cause

Deep Reinforcement Learning

• Deep Reinforcement Learning (DRL) combines reinforcement learning with a class of artificial neural networks known as deep neural networks.

[Diagram: Input → Deep Convolutional Neural Network → Output (some action)]

Reinforcement Learning

• The agent, meant to perform a certain task, interacts with the environment through a
sequence of observations, actions and rewards.

[Diagram: the Agent sends an Action to the Environment; the Environment returns a State and a Reward to the Agent]

• The action of the agent is interpreted in this environment as the actuation of the vehicle
pedals.
• The agent’s goal is to select actions in a fashion that maximizes cumulative future reward.
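As a minimal sketch of that observation-action-reward loop, the toy environment below stands in for CARLA; the gap variable, the two actions, and the reward values are illustrative assumptions, not the tool chain's actual interface.

```python
class ToyEnvironment:
    """Hypothetical stand-in for the driving environment: the state is the
    gap to an obstacle; braking closes the gap slowly, coasting quickly."""

    def __init__(self):
        self.gap = 50.0  # meters to the obstacle

    def step(self, action):
        # action 0 = coast, action 1 = brake
        self.gap -= 5.0 if action == 0 else 1.0
        collided = self.gap <= 0.0
        reward = -10.0 if collided else 1.0  # echoes the slide's penalty/bonus idea
        done = collided or self.gap <= 10.0  # stop safely near the obstacle
        return self.gap, reward, done

def run_episode(policy, env):
    """Accumulate the cumulative future reward the agent tries to maximize."""
    state, total, done = env.gap, 0.0, False
    while not done:
        action = policy(state)
        state, reward, done = env.step(action)
        total += reward
    return total

# A simple policy: brake once the obstacle is near.
total = run_episode(lambda gap: 1 if gap < 30.0 else 0, ToyEnvironment())
```

Swapping in a different policy changes the cumulative reward, which is exactly the signal a learning agent uses to improve its action selection.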

Vehicle Longitudinal Control

• An essential part of an autonomous vehicle's control system is a longitudinal controller for accelerating and decelerating the vehicle.
• Longitudinal control systems need to reliably perceive and adapt to rapidly changing
conditions in the driving environment.
• In the context of reinforcement learning:
‐ The system recognizes the relations between its actions and the associated effect on the
environment.
‐ The aim is to train an agent to successfully control the acceleration and deceleration of the vehicle.

The agent’s performance depends highly on its reward function!

Vehicle Longitudinal Control

Conventional pipeline:
CARLA Driving Simulator → Ansys Real-Time Radar → range-Doppler raw data → post-processing → object list → path planning/control algorithm
State updates and action requests close the loop back to the simulator.

DRL pipeline:
CARLA Driving Simulator → Ansys Real-Time Radar → range-Doppler → DRL model
State updates and action requests close the loop back to the simulator.
Open Simulation Interface

• The Open Simulation Interface (OSI) is a specification for interfaces between models
and components of a distributed simulation.

• OSI has a strong focus on environmental perception of automated driving functions.

• OSI was also developed to address the emerging ISO 23150 standard for a standardized communication interface to real sensors.

• OSI defines generic interfaces to ensure modularity, integrability, and interchangeability of the individual components.

Open Simulation Interface

• Official interfaces and involved models, using OSI plus additional custom messages (marked *):

[Diagram: CARLA Driving Simulator → SensorView → Sensor Simulation and RL Model (Ansys Real-Time Radar Model feeding the Deep Reinforcement Model); custom StateUpdate* and ActionRequest* messages close the loop between the RL model and CARLA]

RL: Reinforcement Learning
Tool Chain Components

The simulation tool chain consists of two main pillars:
1. The driving simulator, represented by CARLA.
2. The Real-Time Radar and the RL model.

CARLA Server:
• World-state and actors' update.
• Computation of physics.
• ...

CARLA Client (scalable client-server):
• Actors' control.
• Setting world conditions.
• CARLA to OSI format.
• RL model actions to CARLA inputs.
• State update output.
• ...

ZMQ (messaging library) connects the CARLA client to the sensor simulation.

Sensor Simulation and RL Model:
Ansys Real-Time Radar:
• Radar parameters and configuration.
• OSI integration.
• Radar simulation.
Deep Reinforcement Model:
• Consumes range-Doppler images and environment state updates.
• Provides RL model actions.
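The StateUpdate*/ActionRequest* exchange carried by ZMQ can be sketched as a pair of encode/decode helpers; the field names here are hypothetical, and JSON stands in for the protobuf encoding an OSI-based tool chain would normally use (the ZMQ send/recv calls appear only in comments).

```python
import json

def encode_state_update(ego_speed, obstacle_distance, collided):
    # Hypothetical StateUpdate payload; the real tool chain uses OSI
    # protobuf messages, with a ZMQ socket carrying the bytes,
    # e.g. socket.send(payload).
    return json.dumps({
        "type": "StateUpdate",
        "ego_speed": ego_speed,
        "obstacle_distance": obstacle_distance,
        "collided": collided,
    }).encode("utf-8")

def decode_action_request(payload):
    # Hypothetical ActionRequest payload: the pedal actuation chosen
    # by the agent, to be applied on the CARLA side.
    msg = json.loads(payload.decode("utf-8"))
    assert msg["type"] == "ActionRequest"
    return msg["brake"], msg["throttle"]

# Round trip of an ActionRequest as the CARLA client would receive it:
request = json.dumps(
    {"type": "ActionRequest", "brake": 0.0, "throttle": 0.5}
).encode("utf-8")
brake, throttle = decode_action_request(request)
```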
CARLA Terrain and Road Network (Town-02) & Driving Loops

Simulation Tool Chain – 3D Models

• Windshield and car windows
• Vehicle body
• Head and tail lamps
• Wheels
• Number plates
Tool Chain Implementation
Ansys Real Time Radar
• Ansys Real Time Radar (RTR):
‐ Physics-based and high-fidelity simulation.
‐ Multi-bounce ray tracing for high simulation fidelity.
‐ Captures physics beyond line-of-sight sensing.
‐ Support for a variety of electromagnetic material models.
• RTR Data output:
‐ Processed range-Doppler imagery.
‐ Receiver raw data (I/Q or I_real) from the ADC.

[Figures: processed range-Doppler data per Rx channel; raw I/Q Rx channel data (post-ADC)]
Real-Time Radar Modeling: Waveforms and Outputs
Frequency Modulated Continuous Wave (FMCW)
• Common automotive radar waveform
• Lower power than pulse-Doppler
• Range offset caused by coupling between the range and Doppler shifts
[Figure: FMCW chirp sequence: frequency sweeps from fmin to fmax around fcenter; chirps 1…N alternate Tx 1/Tx 2 within one Coherent Processing Interval (CPI)]

Pulse-Doppler Waveform
[Figure: pulse train: pulses 1…N alternate Tx 1/Tx 2 within one Coherent Processing Interval (CPI)]
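Range-Doppler maps such as those produced from these waveforms are conventionally formed by a 2D FFT over fast time (range) and slow time (Doppler). A minimal numpy sketch, using a synthetic single-target beat signal rather than actual RTR output:

```python
import numpy as np

def range_doppler_map(iq):
    # 2D FFT: fast time (axis 0) -> range bins, slow time (axis 1) -> Doppler
    # bins, shifted so zero Doppler sits at the center column.
    rd = np.fft.fft(iq, axis=0)
    return np.abs(np.fft.fftshift(np.fft.fft(rd, axis=1), axes=1))

# Synthetic beat signal for one target: a constant beat frequency within each
# chirp (range bin 40) and a chirp-to-chirp phase ramp (Doppler bin +12).
n_samples, n_chirps = 200, 250    # matches numFreqSamples and numPulseCPI
n = np.arange(n_samples)[:, None]
m = np.arange(n_chirps)[None, :]
iq = np.exp(2j * np.pi * (40 * n / n_samples + 12 * m / n_chirps))

rd = range_doppler_map(iq)
r_bin, d_bin = np.unravel_index(np.argmax(rd), rd.shape)
# The peak lands at range bin 40, twelve Doppler bins right of center.
```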

©2021 Ansys, Inc.


Real-Time Single Radar Simulation, Multiple Channels
Simulating raw radar I&Q channel data

• GPU hardware: Nvidia Quadro GV100 (Volta GPU chipset, 32 GB RAM)
• Simulating a single radar with multiple channels: producing raw I/Q channel data, not computing range-Doppler imagery
• Scenario: long, busy street (full interactions modeled): 1 km road, 70 vehicles, 336 streetlights, 14 buildings, 42 traffic signals

Number of channels | Time per simulation frame (ms) | Frame rate (Hz)
1   | 3.6   | 275
5   | 17.5  | 57
12  | 27.5  | 36.4
15  | 30    | 30
110 | 225   | 4.44

Benchmark data as of Sep. 20, 2020
Radar Configuration
Parameter Value
numChannels 1
hpbwHorizDeg 140
hpbwVertDeg 30
centerFreq 76.5e9
bandWidth 300e6
numFreqSamples 200
cpiDuration 0.00979
numPulseCPI 250
rPixels 512
dPixels 384
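A couple of resolution figures follow directly from these parameters (bandWidth, centerFreq, cpiDuration); the sketch below derives them from the standard radar relations, not from any RTR API.

```python
c = 299_792_458.0        # speed of light, m/s

bandwidth = 300e6        # bandWidth from the table, Hz
center_freq = 76.5e9     # centerFreq, Hz
cpi = 0.00979            # cpiDuration, s

# Range resolution: c / (2B), about 0.5 m for 300 MHz of bandwidth.
range_res = c / (2 * bandwidth)

# Velocity resolution over one CPI: wavelength / (2 * T_CPI), about 0.2 m/s.
wavelength = c / center_freq
velocity_res = wavelength / (2 * cpi)
```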

Ego Vehicle and Radar Mounting Position

Real-Time Radar Sensor Integration with CARLA

CARLA Server (UNREAL Engine, C++):
• World-state and actors' update.
• Sensor rendering.
• Computation of physics.
• Running in synchronous mode with a fixed fps rate.
• ...

CARLA Client (Python, scalable client-server):
• Actors' control.
• Setting world conditions.
• Converts CARLA ground-truth data to OSI format.
• Converts OSI format to sensor format.
• Imports an OpenDrive map.
• Hosting ZMQ sockets.
• Running in synchronous mode with a fixed fps rate.
• ...

Radar (standalone with Python bindings, connected to the client via the ZMQ messaging library):
• Radar parameters and configuration.
• Hosting ZMQ socket.

Assets:
• Provide map data.
• Provide 3D models for static and dynamic actors.
Radar Sensor Integration with CARLA – Material Properties

OSI messages: OSI::SensorView, OSI::GroundTruth, OSI::SensorViewConfiguration, OSI::SensorData

Radar:
• Radar parameters and configuration.
• Standalone with Python bindings.
DRL Architecture

[Diagram: Input (range-Doppler image) → convolution layers → fully connected layers → Output (action)]
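A heavily simplified sketch of that input-to-action pipeline: the convolutional stack is replaced by average pooling and the fully connected head uses random placeholder weights, so this only illustrates the data flow, not the trained model.

```python
import numpy as np

rng = np.random.default_rng(0)

def downsample(image, factor=32):
    # Stand-in for the convolution layers: average-pool the 512x384
    # range-Doppler map down to a 16x12 feature grid.
    h, w = image.shape[0] // factor, image.shape[1] // factor
    pooled = image[: h * factor, : w * factor].reshape(h, factor, w, factor)
    return pooled.mean(axis=(1, 3))

def q_values(image, weights, bias):
    # Fully connected head: flattened features -> one Q-value per action.
    return downsample(image).ravel() @ weights + bias

n_actions = 4                      # the four brake/throttle actions
rd_map = rng.random((512, 384))    # placeholder range-Doppler input
n_features = downsample(rd_map).size
weights = rng.standard_normal((n_features, n_actions))
bias = np.zeros(n_actions)

# Greedy action selection: the index of the highest Q-value.
action = int(np.argmax(q_values(rd_map, weights, bias)))
```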
DRL Architecture

• The DRL model is based on a deep Q-learning model originally created to play the Atari game “Space Invaders”:
‐ https://github.com/philtabor/Youtube-Code-Repository

• Updated actions:

Action | Brake [0, 1] | Throttle [0, 1] | Comment
1 | 0.0 | 0.0 | Do nothing
2 | 1.0 | 0.0 | Full brake
3 | 0.0 | 1.0 | Full throttle
4 | 0.0 | 0.5 | Half throttle
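The action table maps one-to-one onto pedal commands; a sketch, with a plain dict standing in for CARLA's carla.VehicleControl so the example stays self-contained:

```python
# (brake, throttle) pairs indexed by action, mirroring the table above.
ACTIONS = {
    1: (0.0, 0.0),   # do nothing
    2: (1.0, 0.0),   # full brake
    3: (0.0, 1.0),   # full throttle
    4: (0.0, 0.5),   # half throttle
}

def to_control(action):
    """Translate a discrete action into the pedal actuation applied in the
    simulator (e.g. carla.VehicleControl(throttle=..., brake=...))."""
    brake, throttle = ACTIONS[action]
    return {"brake": brake, "throttle": throttle}

control = to_control(4)  # half throttle, no brake
```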
DQN Reward Function

• Reward policy:

if_obstacle | if_collision | target_distance [m] | target_speed [m/s] | ego_speed [m/s] | reward
True  | True  | -    | -  | -  | -10
True  | False | >20  | >0 | 0  | -5
True  | False | >10  | =0 | =0 | -1
False | False | -    | -  | 0  | -1
False | False | -    | -  | >0 | +1
True  | False | >=20 | >0 | >0 | +1
True  | False | <=20 | >0 | >0 | +5
True  | False | <=10 | 0  | 0  | +3
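One way to implement the reward policy is as an ordered rule list, checked top to bottom; treating '-' as "any value" and resolving the overlapping distance conditions by row order are assumptions on our part, not stated in the slide.

```python
def reward(if_obstacle, if_collision, target_distance, target_speed, ego_speed):
    """Reward policy from the table, checked row by row from the top."""
    if if_obstacle and if_collision:
        return -10
    if if_obstacle and target_distance > 20 and target_speed > 0 and ego_speed == 0:
        return -5
    if if_obstacle and target_distance > 10 and target_speed == 0 and ego_speed == 0:
        return -1
    if not if_obstacle and ego_speed == 0:
        return -1
    if not if_obstacle and ego_speed > 0:
        return +1
    if if_obstacle and target_distance >= 20 and target_speed > 0 and ego_speed > 0:
        return +1
    if if_obstacle and target_distance <= 20 and target_speed > 0 and ego_speed > 0:
        return +5
    if if_obstacle and target_distance <= 10 and target_speed == 0 and ego_speed == 0:
        return +3
    return 0  # no table row matched

r = reward(True, True, 0.0, 0.0, 0.0)  # collision case
```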
Supervised Learning
Simulation Tool Chain for Machine Learning

Data Generation:
• Scenario definition → scenario variation → scenario generation
• Scenario & sensor simulation produces the data set: labeled data (“ground truth”) and metrics

Neural Network Training:
• Data preparation → neural network training → trained model

Neural Network Evaluation:
• Inference on labeled data (“ground truth”) → inference results → evaluation metrics
Tool Chain Implementation
Ansys optiSlang

Labeled Range-Doppler Maps

• At 40–100 fps, we generated over 160,000 labeled images (512x384).

Results and Summary
Results

Results

Number of episodes | Number of valid episodes | Number of valid accidents
6469 | 6023 | 665

Action | Brake [0, 1] | Throttle [0, 1] | Comment | Usage [%]
1 | 0.0 | 0.0 | Do nothing | 31
2 | 1.0 | 0.0 | Full brake | 20
3 | 0.0 | 1.0 | Full throttle | 26
4 | 0.0 | 0.5 | Half throttle | 23

Results After Training:
Avoid Car at Intersection and Following Stop

Results After Training:
Avoid False Alarms With Approaching Cars

Results After Training:
Handling Issues Not Addressed With Reward Function

Results After Training:
Avoiding Collision with Car at Intersection

Free Learning Resources at your Fingertips

• Free, online physics and engineering courses
• Include video lectures, accompanying handouts, simulation exercises, and quizzes
• Perfect for use with our free student download
• Visit ansys.com/courses to learn more

Questions?
Jeff Blackburn
Senior Product Sales Manager – Ansys Autonomy
[email protected]
650-313-3649
