21BAI10450 MATLAB Report
Exp. 1: Train a biped robot to walk using reinforcement learning
Aim:
The reinforcement learning environment for this example is a biped robot. The
training goal is to make the robot walk in a straight line using minimal control
effort.
Theory:
• In the neutral 0 rad position, both legs are straight and the ankles are flat.
• Foot contact with the ground is modeled using the Spatial Contact Force block.
• The agent can control 3 individual joints (ankle, knee, and hip) on both legs of
the robot by applying torque signals from -3 to 3 N·m. The actual computed
action signals are normalized between -1 and 1.
• The episode terminates if the robot torso center of mass drops below 0.1 m in the
Z direction (the robot falls), moves more than 1 m in either Y direction (the robot
moves too far to the side), or if the absolute value of the roll, pitch, or yaw
exceeds 0.7854 rad (a sketch of this check follows this list).
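As a rough illustration, the termination conditions above could be checked as follows; the variable names are placeholders for the corresponding signals, not names taken from the example:
% Sketch of the episode-termination check described above (assumed variable names).
isDone = torsoZ < 0.1 ...                        % torso too low: the robot has fallen
      || abs(torsoY) > 1 ...                     % drifted too far to the side
      || any(abs([roll pitch yaw]) > 0.7854);    % tipped past ~45 degrees in any axis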
This reward function encourages the agent to move forward by providing a positive reward for
positive forward velocity. It also encourages the agent to avoid episode termination by providing
a constant reward at every time step. The remaining terms in the reward function are
penalties for substantial changes in lateral and vertical translation and for the use of excess
control effort.
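A minimal sketch of this reward structure is shown below; the weights w1 through w4 are illustrative placeholders, not the exact coefficients used in the example.
% Sketch of the reward described above (illustrative weights, assumed signal names).
% vx      - forward velocity of the torso
% dy, dz  - change in lateral and vertical translation
% Ts, Tf  - agent sample time and episode duration (constant per-step reward)
% u       - previous normalized joint-torque action
reward = vx ...                     % reward forward progress
       - w1*dy^2 - w2*dz^2 ...      % penalize lateral and vertical translation
       + w3*(Ts/Tf) ...             % constant reward at every time step
       - w4*sum(u.^2);              % penalize excess control effort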
Procedure:
The robot was trained in the following manner:
1. Use the following MATLAB command to open the walking biped robot example:
openExample('control_deeplearning/TrainBipedRobotToWalkUsingReinforcementLearningAgentsExample')
2. This command loads the MATLAB example files. Open the .mlx file, as it is the main
notebook.
3. We can execute the notebook and use the pretrained model of the robot by clicking
"Run All" in the toolbar.
4. The file allows us to explore two techniques for training the robot: DDPG or TD3.
5. DDPG:
a. Overview: DDPG (deep deterministic policy gradient) is a reinforcement learning
algorithm designed for environments with continuous action spaces, combining an
actor-critic architecture with deep neural networks.
b. Actor-critic structure: It comprises an actor network that learns a deterministic
policy and a critic network that estimates the action-value function (Q-function).
c. Experience replay and target networks: DDPG uses an experience replay buffer to
store past experiences and target networks to provide more stable target values
during training.
d. Off-policy learning with soft updates: It learns from past experiences, allowing for
efficient use of data, and employs soft updates to gradually blend the current and
target network weights, improving training stability. A hedged agent-creation
sketch follows this list.
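As a hedged sketch (not the example's exact setup), a default DDPG agent could be created with the Reinforcement Learning Toolbox as follows; the observation size, sample time, and option values are assumptions:
% Hedged sketch of DDPG agent creation (assumed dimensions and option values).
obsInfo = rlNumericSpec([29 1]);                              % assumed observation size
actInfo = rlNumericSpec([6 1], LowerLimit=-1, UpperLimit=1);  % 6 normalized joint torques
agentOpts = rlDDPGAgentOptions( ...
    SampleTime=0.025, ...                 % assumed agent sample time
    ExperienceBufferLength=1e6, ...       % experience replay buffer (item c)
    TargetSmoothFactor=1e-3);             % soft target-network updates (item d)
agent = rlDDPGAgent(obsInfo, actInfo, agentOpts);  % default actor and critic networks
% Training is then launched with train(agent, env, rlTrainingOptions(...)).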
Simulation:
Conclusion:
We trained the biped robot to walk in a straight line with minimal control effort using
reinforcement learning.
Exp. 2: Parking a car safely using RL
Aim:
Automatically parking a car that is left in front of a parking lot is a challenging
problem. The vehicle's automated systems are expected to take over control and
steer the vehicle to an available parking spot.
Theory:
On-board sensors are used to perceive the environment around the vehicle. For example:
• Front and side cameras for detecting lane markings, road signs (stop
signs, exit markings, etc.), other vehicles, and pedestrians
• Lidar and ultrasound sensors for detecting obstacles and calculating
accurate distance measurements
• IMU and wheel encoders for dead reckoning
The perceived environment includes an understanding of road markings to interpret
road rules and infer drivable regions, recognition of obstacles, and detection of
available parking spots.
As the vehicle sensors perceive the world, the vehicle must plan a path through
the environment towards a free parking spot and execute a sequence of control
actions needed to drive to it. While doing so, it must respond to dynamic
changes in the environment, such as pedestrians crossing its path, and readjust
its plan.
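At a high level, this perceive-plan-act cycle can be sketched as the loop below; all helper names (perceiveEnvironment, planPath, computeControls, applyControls) are hypothetical placeholders, not functions from the toolbox or the example:
% Hypothetical sketch of the perceive-plan-act cycle described above.
% All helper functions here are placeholder names, not real toolbox functions.
parked = false;
while ~parked
    scene    = perceiveEnvironment(sensors);             % cameras, lidar, ultrasound, IMU
    path     = planPath(scene, currentPose, goalSpot);   % replan when the scene changes
    controls = computeControls(path, currentPose, currentVel);
    [currentPose, currentVel, parked] = applyControls(vehicle, controls);
end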
Procedure:
1. Load the environment.
7. Plan a local trajectory starting at the current pose and closely following
the reference path using the controllerTEB object.
8. Now combine all the previous steps in the planning process and run the
simulation for the complete route plan. This process involves
incorporating the behavioral planner.
Final Code
% Set the vehicle pose back to the initial starting point.
currentPose = [4 12 0]; % [x, y, theta]
vehicleSim.setVehiclePose(currentPose);

% Reset velocity.
currentVel = 0; % meters/second
vehicleSim.setVehicleVelocity(currentVel);

isGoalReached = false;
while ~isGoalReached
    % Plan for the next path segment if near to the next path segment start pose.
    if planNextSegment(behavioralPlanner, currentPose, 2*maxLocalPlanDistance)
        % Request next maneuver from behavioral layer.
        [nextGoal, plannerConfig, speedConfig] = requestManeuver(behavioralPlanner, ...
            currentPose, currentVel);

        % Do local planning: update the local costmap around the current pose.
        localPlanner.Map = getLocalMap(costmap, currentPose, maxLocalPlanDistance);

        if hasNewPath
            localPlanner.ReferencePath = refPath;
            hasNewPath = false;
        end

        % (Excerpt ends here; in the full example the loop continues with local
        % trajectory planning via controllerTEB, vehicle control and simulation,
        % and updating isGoalReached before the if and while blocks are closed.)
Simulation:
Conclusion:
1. With this experiment, we understood the working of a hybrid RL system: RL is used
only for the parking maneuver, which is the difficult part, rather than also for
driving around the parking lot.
2. The car was able to park safely based on its distance from obstacles.
3. If we also implemented PPO for the task of moving around the lot and increased the
camera range so that a much larger area could be perceived, we could reduce the
time required to park. It would be as if the car had a bird's-eye view of the
parking lot and knew where the free spot was beforehand.