Autonomous Racing Robot With an Arduino, a Raspberry Pi and a Pi Camera
For a racing competition in Toulouse, a friend and I designed and programmed an autonomous racing robot powered by a Raspberry Pi, an Arduino Uno and a Pi Camera. We used Python, C++ and a neural network for image processing, operating in real time at 60 FPS! In this article, we share our experience and give the key elements to reproduce the car.
We also release the 3D models and the source code (https://github.com/sergionr2/RacingRobot), along with the data we used to train our robot.
Note: no knowledge of machine learning is required to read this article, but it can help to understand the finer details of the image processing method.
After showing the racing car in action, we briefly present the robotics competition and then describe the main components of the car (hardware/software). Finally, we will dive into the internals of the robot: how we made the car autonomous.
April 2018 Update: The Arduino — Raspberry Pi communication is described in this article: https://medium.com/@araffin/simple-and-robust-computer-arduino-serial-communication-f91b95596788
January 2019 Update: Follow up to this project, using reinforcement learning: https://towardsdatascience.com/learning-to-drive-smoothly-in-minutes-450a7cdb35f4
April 2022 Update: I did a series of videos on learning to race with reinforcement learning: https://www.youtube.com/watch?v=ngK33h00iBE&list=PL42jkf1t1F7dFXE7f0VTeFLhW0ZEQ4XJV
The Result
(Please watch the entire video, there are bonuses at the end ;))
The Race
The race is an autonomous time trial on a 110-meter track (~120 yards) with a black-and-white line in the center. Because of illumination changes, relying on computer vision instead of a range sensor (e.g. an ultrasonic sensor) was a challenge.
Our strategy was to follow the line in the middle. It prevents the car from getting too close to the walls and makes it possible to anticipate turns. However, we must keep the line in sight and be robust to illumination changes.
The Building Blocks: Hardware
- Chassis from an old Nikko RC car (e.g. this one)
- Raspberry Pi 3 + Camera Module
- Arduino Uno + Custom Shield
- DC Motor Controller
- 3D printed parts (cf. README links)
- Servomotor
- Emergency Stop Button (essential!)
- Batteries: original Nikko battery for the motor + External battery for the Raspberry Pi + Arduino
What is a Raspberry Pi?
A Raspberry Pi is a credit card-sized computer originally designed for education (source: opensource.com)
What is an Arduino?
In a nutshell, an Arduino is an open hardware development board that can be used by tinkerers, hobbyists, and makers to design and build devices that interact with the real world. (source: opensource.com)
Software
- C++ on the Arduino
- Python + Numpy + OpenCV on the Raspberry Pi
- Lasagne + Theano for training the neural network
- Homemade serial protocol for Arduino <-> Raspberry Pi communication (cf. Appendix)
Architecture Overview
The image processing and control are done on the Raspberry Pi. It communicates with the Arduino, which sends orders to the motors (direction and speed) using Pulse-Width Modulation (PWM). We detail the communication between the Arduino and the Raspberry Pi in the appendix.
After briefly presenting the result on the simulated environment in the next section, we explain our image processing method to detect the line and describe our controller that was used to follow the line.
Simulator
We created a simulator using Blender and Python to test our computer vision algorithm along with our control strategy. The result, which was promising, can be found below. We randomly shake the input image to test the robustness of our approach.
Image Processing: Detecting the Line
To control the car along the track, we need to detect the line and anticipate turns: this allows the car to go at full speed on straight lines and to slow down before turning. This is done in two steps by processing the image coming from the camera. We will first present how we predict the line curvature after line detection, and then focus on the two approaches we explored to spot the line center.
Predicting Line Curve
In order to anticipate turns and to reduce the computation time, we apply the line detection on three regions of interest (ROI), i.e. we crop the input image.
We then fit a line (green line on the image above) to the three points obtained and compute the angle between this line and a reference (blue line on the image). This gives us an estimate of how sharp the turn is. If we are on a straight line, the two lines coincide.
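As an illustration, here is a minimal sketch of that estimation (the parametrization and the reference line are simplified; the exact implementation is in the repository):

```python
import numpy as np

def estimate_turn(centroids):
    """Estimate how sharp the turn is from the line centers found in the 3 ROIs.

    centroids: list of three (x, y) points, one per region of interest.
    Returns the angle (in degrees) between the fitted line and a vertical reference.
    """
    x, y = zip(*centroids)
    # fit x as a linear function of y: on a straight line, the slope is zero
    slope, _ = np.polyfit(y, x, 1)
    # angle between the fitted (green) line and the vertical reference (blue) line
    return np.degrees(np.arctan(slope))
```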
Color Based Approach
Our initial approach to detect the line was to use a color based method. The basic idea is to find the biggest area of a given color and then compute the centroid (“center”) of that area.
The first step is to convert the image to the HSV color space, then compute a color mask based on pre-defined thresholds, and finally find the centroid, which should be the center of the line.
In about 30 lines of Python code, it looks like this:
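(The snippet below is a condensed sketch of that approach, not the exact released script; the HSV thresholds are illustrative placeholders and need tuning.)

```python
import cv2
import numpy as np

# Illustrative thresholds for a white line in HSV space (they must be tuned)
LOWER_WHITE = np.array([0, 0, 200])
UPPER_WHITE = np.array([180, 40, 255])

def detect_line_center(image):
    # convert the image to the HSV color space
    hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
    # compute the color mask based on the pre-defined thresholds
    mask = cv2.inRange(hsv, LOWER_WHITE, UPPER_WHITE)
    # find the biggest area of that color ([-2] keeps it compatible across OpenCV versions)
    contours = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]
    if not contours:
        return None
    biggest_area = max(contours, key=cv2.contourArea)
    # compute the centroid ("center") of that area using image moments
    moments = cv2.moments(biggest_area)
    if moments["m00"] == 0:
        return None
    cx = int(moments["m10"] / moments["m00"])
    cy = int(moments["m01"] / moments["m00"])
    return cx, cy
```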
Machine Learning Approach
The main drawback of the previous method is that it is not robust to illumination changes. We tried histogram equalization to overcome this issue, but it was not sufficient and was computationally costly.
So, we decided to apply machine learning to detect the line, that is to say, we wanted to train a model that can predict where the line is, given an image as input. I chose to use a neural network because it is the method I am most familiar with and it was easy to implement using pure Python and numpy code.
Overview of the Regression Problem
In a supervised learning setting, that is to say, when we have labeled data, the goal is to predict the label given the input data (e.g. predict if an image contains a cat or a dog). In our case, we wanted to predict the coordinates of the line center given an input image from the camera.
We simplified the problem by predicting only the x-coordinate (along the width) of the line center, given a region of the image, i.e. we assumed that the center is located at half of the height of the cropped image.
To evaluate how good our model is, we chose the Mean Squared Error (MSE) loss as the objective: we take the squared error between the x-coordinates of the predicted and true line centers and average it over all the training samples.
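In code, the objective boils down to a few lines of numpy:

```python
import numpy as np

def mse(x_predicted, x_true):
    # squared error between predicted and true x-coordinates,
    # averaged over all the training samples
    return np.mean((x_predicted - x_true) ** 2)
```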
Image Labeling
After recording a video in remote control mode, we manually labeled 3000 images (in ~25 minutes, i.e. ~2 labels/s). For that purpose, we created our own labeling tool: each training image is shown one by one; we click on the center of the white line and then press any key to move on to the next image.
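A stripped-down version of such a labeling tool could look like this (a sketch with OpenCV, not the exact tool we released):

```python
import cv2

labels = {}  # image path -> x-coordinate of the clicked line center

def on_mouse_click(event, x, y, flags, image_path):
    if event == cv2.EVENT_LBUTTONDOWN:
        labels[image_path] = x  # store the clicked line center

def label_images(image_paths):
    cv2.namedWindow("labeling")
    for image_path in image_paths:
        cv2.setMouseCallback("labeling", on_mouse_click, image_path)
        cv2.imshow("labeling", cv2.imread(image_path))
        cv2.waitKey(0)  # press any key to move on to the next image
    return labels
```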
Preprocessing and Data Augmentation
Several steps are required before applying our learning algorithm to the data. First, we resize the input images to reduce the input dimension (by a factor of 4), which drastically cuts down the number of learned parameters. That simplifies the problem and speeds up both training and prediction.
To avoid learning issues and speed up training, it is a good practice to normalize the data. In our case, we normalized the input images in the range [-1, 1] and scaled the output (predicted x-coordinate of the center) to [0, 1].
The preprocessing script can be found below:
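(The snippet below is a condensed sketch of those steps; the resize factor comes from the text above, everything else is illustrative, and the full script is in the repository.)

```python
import cv2
import numpy as np

RESIZE_FACTOR = 4  # reduce the input dimension by a factor of 4

def preprocess_image(image):
    # downscale the image to cut down the number of learned parameters
    height, width = image.shape[:2]
    image = cv2.resize(image, (width // RESIZE_FACTOR, height // RESIZE_FACTOR),
                       interpolation=cv2.INTER_AREA)
    # normalize pixel values to the range [-1, 1]
    return image.astype(np.float32) / 127.5 - 1.0

def normalize_label(x_center, image_width):
    # scale the x-coordinate of the line center to [0, 1]
    return x_center / float(image_width)
```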
To increase the number of training samples, we flipped the images vertically, doubling the size of the training set in a quick and cheap way.
Neural Network Architecture
We used a feed-forward neural network composed of two hidden layers with 8 and 4 units respectively. Although we experimented with other architectures, including CNNs, this one achieved good results and could run in real time at more than 60 FPS!
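Since the network is tiny, the prediction can be written in a few lines of pure numpy. Here is a sketch assuming ReLU activations and a linear output, which may differ slightly from the trained model:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def predict_line_center(x, weights):
    """Forward pass of the feed-forward network.

    x: flattened, normalized input image (1D array)
    weights: trained parameters W1, b1 (8 units), W2, b2 (4 units), W3, b3 (output)
    Returns the predicted x-coordinate of the line center, scaled to [0, 1].
    """
    h1 = relu(x.dot(weights["W1"]) + weights["b1"])   # first hidden layer, 8 units
    h2 = relu(h1.dot(weights["W2"]) + weights["b2"])  # second hidden layer, 4 units
    return float(h2.dot(weights["W3"]) + weights["b3"])  # output layer, 1 unit
```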
Hyperparameters
Hyperparameters are parameters whose values are set prior to the commencement of the learning process. By contrast, the values of other parameters are derived via training. (definition from Wikipedia)
Hyperparameters include the network architecture, the learning rate, the minibatch size, …
To validate the choice of hyperparameters, we split the dataset into 3 subsets: a training set (60%), a validation set (20%) and a test set (20%). We kept the model with the lowest error on the validation set and estimated our generalization error on the test set. The hyperparameter details can be found in the appendix. The network was trained in less than 20 minutes on an 8-core laptop CPU.
Controller: Following the Line
Once we have processed the image and computed our deviation from the center of the line, we need to correct our error by regulating the car direction and speed.
For that purpose, we used a classic proportional-derivative (PD) controller to follow the line. The speed is regulated using two heuristics: the current deviation from the line center (which is the error) and how sharp the line curve is. The bigger the error, the lower the speed (and likewise for the line curvature).
The idea of our control strategy can be summarized in two lines:
command = Kp * e + Kd * (de/dt)  # where "e" is the error (deviation from the line center)
speed = h * MIN_SPEED + (1 - h) * MAX_SPEED  # where h = 0 on a straight line and 1 in a sharp turn
# MAX_SPEED depends on the error "e" in the same manner,
# that is to say, the bigger the error, the lower the MAX_SPEED
We found that the turn estimator was quite noisy, so we decided to apply a moving average to its value. It improved the stability of the control.
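For instance, a simple moving average over the last few estimates is enough to smooth it out (the window size below is an arbitrary example):

```python
from collections import deque

import numpy as np

WINDOW_SIZE = 5  # arbitrary example value
turn_history = deque(maxlen=WINDOW_SIZE)

def smooth_turn_estimate(new_estimate):
    # average the turn estimator over the last WINDOW_SIZE values
    turn_history.append(new_estimate)
    return float(np.mean(turn_history))
```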
Conclusion
We described how we built an autonomous racing car with a control strategy based on computer vision. We enjoyed developing this robot and wanted to share our experience. Our goal is that you can re-use, or get inspired by, what we have done.
We hope you enjoyed this article, and remember, sharing is caring =D.
If you have any question, any remark (or if you have found a typo), please leave a comment below ;).
Appendix
Hyperparameters used for training
The network was trained using Lasagne (a high-level wrapper on Theano) for 1000 epochs, with a minibatch size of 1 and ADAM as the optimizer with an initial learning rate of 1e-5. We also added dropout with a probability of 0.1 on the input and an L2 penalty of 1e-4 on the weights.
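For reference, a setup with those hyperparameters would look roughly like this in Lasagne (a sketch: the input size and the nonlinearities are assumptions, not necessarily the ones of the released model):

```python
import lasagne
import theano
import theano.tensor as T

N_FEATURES = 20 * 80 * 3  # assumed size of the flattened, resized input image

input_var = T.matrix("inputs")
target_var = T.vector("targets")

# 2 hidden layers (8 and 4 units), dropout of 0.1 on the input
network = lasagne.layers.InputLayer((None, N_FEATURES), input_var=input_var)
network = lasagne.layers.DropoutLayer(network, p=0.1)
network = lasagne.layers.DenseLayer(network, num_units=8)
network = lasagne.layers.DenseLayer(network, num_units=4)
network = lasagne.layers.DenseLayer(network, num_units=1,
                                    nonlinearity=lasagne.nonlinearities.sigmoid)

# MSE loss + L2 penalty of 1e-4 on the weights
prediction = lasagne.layers.get_output(network).flatten()
loss = lasagne.objectives.squared_error(prediction, target_var).mean()
loss += 1e-4 * lasagne.regularization.regularize_network_params(
    network, lasagne.regularization.l2)

# ADAM optimizer with an initial learning rate of 1e-5
params = lasagne.layers.get_all_params(network, trainable=True)
updates = lasagne.updates.adam(loss, params, learning_rate=1e-5)
train_fn = theano.function([input_var, target_var], loss, updates=updates)
```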
Arduino — Raspberry Pi Communication: Serial Protocol
To make communication possible between the two boards, and because Arduino does not provide an efficient way to write to the serial port, we used a homemade serial protocol based on the single-byte Arduino method Serial.write().
The protocol is as follows: we first send an order, which is one byte (8 bits) long, and then send the required parameters byte by byte.
We implemented methods to encode and decode integers whose length ranges from one to four bytes. For instance, when sending a 16-bit int, we cut it into two bytes (8 bits each) and store those bytes in a buffer array. Then, we send each byte in the buffer and reconstruct the 16-bit int at reception using bitwise operations (shifting and masking).
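Here is a sketch of that encoding/decoding for an unsigned 16-bit int (the byte order is an assumption; the real protocol is in the repository):

```python
def encode_int16(value):
    # cut the 16-bit int into two bytes (low byte first) and store them in a buffer
    return bytes([value & 0xFF, (value >> 8) & 0xFF])

def decode_int16(buffer):
    # reconstruct the 16-bit int from the two bytes using shifting and masking
    return buffer[0] | (buffer[1] << 8)

# example: encode_int16(300) == b"\x2c\x01" and decode_int16(b"\x2c\x01") == 300
```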
Our implementation also takes into account the limited buffer size of the Arduino: if we send too many bytes in a short amount of time, part of the messages will be lost. To avoid that issue, the Arduino board acknowledges receipt of each order with a "received" message. This "received" message increments a counter that is decremented each time the Raspberry Pi sends an order. The counter must be greater than one to send an order to the Arduino. In Python, this is implemented with a semaphore.
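A minimal version of that flow control could look like this (the initial value of the semaphore, i.e. how many orders the Arduino buffer can absorb, is an assumption):

```python
import threading

# allow a few orders "in flight" before blocking (assumed initial value)
n_received_semaphore = threading.Semaphore(4)

def send_order(serial_file, order_bytes):
    # block until the Arduino has acknowledged enough of the previous orders
    n_received_semaphore.acquire()
    serial_file.write(order_bytes)

def on_received_message():
    # called whenever the Arduino sends back a "received" acknowledgment
    n_received_semaphore.release()
```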
I wrote both C++ (for the Arduino and the computer) and Python bindings for that protocol (a Rust binding is coming… ;)), along with an interactive command parser (very useful for debugging =)). It was also used by our team during the French Cup of Robotics 2017 (where we finished 4th out of 142 =D!)
Why did we use Python on the Raspberry Pi?
At the beginning of the project, we planned to write everything in C++ for performance reasons. However, we did not manage to retrieve more than 30 frames per second (FPS) with the raspicam library, whereas the picamera Python library allows retrieving up to 90 FPS.
ROS Interface
I am currently working on building ROS nodes to use the car along with the camera. An example of teleoperation mode using two ROS nodes can be found on the provided Linux image (cf. README for installation instructions).