Boot Time Labs
Practical Labs
https://bootlin.com
Goals
Implement a live camera system and optimize its boot time.
Here’s a description of the system that we are going to build and optimize in terms of boot time:
Hardware:
• Main board: BeagleBone Black (Regular or Wireless), with an ARM Cortex-A8 SoC
(AM335x from Texas Instruments).
• Extended by a 4 inch LCD cape
• Connected to a USB webcam
Software:
• Bootloader: U-Boot
• Operating system: Linux
• User space: ffmpeg video player
• Build system: Buildroot
• Functionality: as soon as the system has booted, display the video from the USB webcam.
Training setup
Download files and directories used in practical labs
Lab data are now available in a boot-time-labs directory in your home directory. This
directory contains directories and files used in the various practical labs. It will also be used as
working space, in particular to keep generated files separate when needed.
You are now ready to start the real practical labs!
More guidelines
Can be useful throughout any of the labs
• Read instructions and tips carefully. Lots of people make mistakes or waste time because
they missed an explanation or a guideline.
• Always read error messages carefully, in particular the first one which is issued. Some
people stumble on very simple errors just because they specified a wrong file path and
didn’t pay enough attention to the corresponding error message.
• Never stay stuck with a strange problem more than 5 minutes. Show your problem to
your colleagues or to the instructor.
• You should only use the root user for operations that require super-user privileges, such
as: mounting a file system, loading a kernel module, changing file ownership, configuring
the network. Most regular tasks (such as downloading, extracting sources, compiling...)
can be done as a regular user.
• If you ran commands from a root shell by mistake, your regular user may no longer be
able to handle the corresponding generated files. In this case, use the chown -R command
to give the new files back to your regular user.
Example: sudo chown -R myuser:myuser linux/
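To spot files that were created from a root shell by mistake, a minimal sketch like the one below can help. The foreign_files function name is our own (hypothetical); it only relies on the standard find and id commands.

```shell
# List everything under directory $1 that is not owned by the current user.
foreign_files() {
    find "$1" ! -user "$(id -u)"
}

# Example: check the current directory
foreign_files .
```

If this prints any path, running chown -R on the corresponding directory gives the files back to your regular user.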
Git configuration
After installing git on a new machine, the first thing to do is to let git know about your name
and e-mail address:
git config --global user.name 'My Name'
git config --global user.email me@example.com
Such information will be stored in commits. It is important to configure it properly in case we
need to generate and send patches.
As this still represents many git objects to download (about 300 MiB when 5.4 was the latest
version), if you are using an already downloaded git tree, your instructor will probably have
fetched the stable branch ahead of time for you too. You can check by running:
git branch -a
We will choose a particular stable version in the next labs.
Now, let’s continue the lectures. This will leave time for the commands that you typed to
complete their execution (if needed).
Board setup
Objective: set up communication with the board and configure the
bootloader.
Once the USB to Serial connector is plugged in, a new serial port should appear: /dev/ttyUSB0.
You can also see this device appear by looking at the output of dmesg.
To communicate with the board through the serial port, install a serial communication program,
such as picocom:
sudo apt install picocom
If you run ls -l /dev/ttyUSB0, you can also see that only root and users belonging to the
dialout group have read and write access to this file. Therefore, you need to add your user to
the dialout group:
sudo adduser $USER dialout
Important: for the group change to be effective, in Ubuntu 18.04, you have to completely reboot
the system (see footnote 3). A workaround is to run newgrp dialout, but it is not global: you have to run it
in each terminal.
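To check whether the group change is already effective in the terminal you are using, here is a small sketch. The in_group helper is a hypothetical name; it relies on id -nG, which prints the groups of the current shell when called without a user name.

```shell
# Succeed if the current shell session belongs to group $1.
in_group() {
    id -nG | tr ' ' '\n' | grep -qx "$1"
}

if in_group dialout; then
    echo "dialout: active in this shell"
else
    echo "dialout: not active yet (reboot, or run 'newgrp dialout')"
fi
```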
Now, you can run picocom -b 115200 /dev/ttyUSB0 to start serial communication on
/dev/ttyUSB0, with a baud rate of 115200. If you wish to exit picocom, press [Ctrl][a] followed by
[Ctrl][x].
There should be nothing on the serial line so far, as the board is not powered up yet.
It is now time to power up your board by plugging in the mini-USB (BeagleBone Black case) or
micro-USB (BeagleBone Black Wireless case) cable supplied by your instructor (with your PC
or a USB power supply at the other end of the cable).
See what messages you get on the serial line. You should see U-Boot start on the serial line.
Bootloader interaction
Reset your board. Press the space bar in the picocom terminal to stop the U-Boot countdown.
You should then see the U-Boot prompt:
=>
This step was just to check that the serial line was connected properly. In a later lab, we will
replace the existing bootloader with a version that we compiled ourselves.
3 As explained on https://askubuntu.com/questions/1045993/after-adding-a-group-logoutlogin-is-not-enough-in-18-04/.
You will find the schematics of the LCD4 cape in the ~/boot-time-labs/hardware/ directory.
This will be useful to know which header pins remain free to use.
After this lab, you will have a ready to use root filesystem to boot your system with, including
a video player application.
We haven’t compiled the bootloader and kernel for our board yet, but since this part can take
a long time (especially compiling the cross-compiling toolchain), let’s start it now, while we are
still fetching kernel sources or going through lectures.
Setup
As specified in the Buildroot manual (see footnote 5), Buildroot requires a few packages to be installed on your
machine. Let’s install them using Ubuntu’s package manager:
sudo apt install sed make binutils gcc g++ bash patch \
gzip bzip2 perl tar cpio python unzip rsync wget libncurses-dev
You will also need a few extra packages later:
sudo apt install bison flex
Configuring Buildroot
To minimize external dependencies and maximize flexibility, we will ask Buildroot to generate
its own toolchain. This can be better than using external toolchains, as we have the ability to
tweak toolchain settings in a fine way if needed.
Start the Buildroot configuration utility:
make menuconfig
• Target Options menu
– Target Architecture: select ARM (little endian)
– Target Architecture Variant: select cortex-A8
5 https://buildroot.org/downloads/manual/manual.html#requirement-mandatory
– On ARM two Application Binary Interfaces are available: EABI and EABIhf. Unless
you have backward compatibility concerns with pre-built binaries, EABIhf is more
efficient, so make this choice as the Target ABI (which should already be the default
anyway).
– The other parameters can be left to their default value: ELF is the only available
Target Binary Format, VFPv3-D16 is a sane default for the Floating Point Unit, and
using the ARM instruction set is also a good default (we could use the Thumb-2
instruction set for slightly more compact code).
• System configuration menu
– Unselect remount root filesystem read-write during boot. This way, we will
keep the root filesystem in read-only mode. When we make tests and reboot the
system multiple times, this avoids filesystem recovery, which takes approximately 4
seconds and adds jitter to our measurements.
• As we wish to make Buildroot compile its own toolchain and as we will compile the kernel
and bootloader separately, for the moment, we can keep the default build, toolchain and
system settings.
• Target packages menu
– In Audio and video applications, select ffmpeg and inside the ffmpeg submenu,
add the below options:
∗ Select Build libswscale
• For the moment, all the remaining default settings are fine for us.
However, we need to do one thing to customize the root filesystem: we need to add a script that
will automatically start the ffmpeg video player.
To do so, we will use Buildroot’s Root filesystem overlay capability, which allows us to drop
ready-made files into the final root filesystem archive (see footnote 6).
To begin with, let’s start by creating a specific directory to store our Buildroot customizations
for our project.
mkdir board/beaglecam
And in this directory, let’s create a directory for root filesystem overlays:
mkdir board/beaglecam/rootfs-overlay
Now, let’s copy a script that we’re providing you to etc/init.d:
mkdir -p board/beaglecam/rootfs-overlay/etc/init.d/
cp ~/boot-time-labs/rootfs/data/S50playvideo board/beaglecam/rootfs-overlay/etc/init.d/
We can now run make menuconfig again and in System configuration, add
board/beaglecam/rootfs-overlay to Root filesystem overlay directories.
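To double check this setting from the command line, here is a small sketch. It assumes that Buildroot stores the overlay directories in the BR2_ROOTFS_OVERLAY symbol of its .config file; br2_get is a hypothetical helper name, not a Buildroot command.

```shell
# Print the value of a configuration symbol ($1) from a Buildroot
# configuration file ($2, defaults to .config), without the quotes.
br2_get() {
    sed -n "s/^$1=//p" "${2:-.config}" | tr -d '"'
}

# Example, from the Buildroot directory:
#   br2_get BR2_ROOTFS_OVERLAY
```

If the setting was saved as expected, the command should print board/beaglecam/rootfs-overlay.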
Running Buildroot
We are now ready to execute Buildroot:
make
Enjoy, and be patient, as building a cross-compiling toolchain takes time!
6 See https://buildroot.org/downloads/manual/manual.html#customize for details.
After this lab, you will be able to compile U-Boot for your target platform and run it from a
micro SD card provided by your instructor.
Setup
Go to the ~/boot-time-labs/bootloader/u-boot/ directory.
Let’s use the 2019.01 version:
git checkout v2019.01
Compiling environment
If the previous Buildroot lab is already over, we can use the toolchain it built to compile our
bootloader and kernel:
export PATH=$HOME/boot-time-labs/rootfs/buildroot/output/host/bin:$PATH
export CROSS_COMPILE=arm-buildroot-linux-uclibcgnueabihf-
Otherwise, let’s take a cross-compiling toolchain provided by Ubuntu:
sudo apt install gcc-arm-linux-gnueabihf
export CROSS_COMPILE=arm-linux-gnueabihf-
You will need the same settings when you compile the kernel too, and when you recompile
U-Boot and the kernel to optimize them. Let’s make such settings permanent by adding the
below lines at the end of your ~/.bashrc file:
export PATH=$HOME/boot-time-labs/rootfs/buildroot/output/host/bin:$PATH
export CROSS_COMPILE=arm-buildroot-linux-uclibcgnueabihf-
Configuring U-Boot
Let’s use a ready-made U-Boot configuration for the BeagleBone Black board (you can find such
configuration files in the configs/ directory):
make am335x_boneblack_defconfig
Compiling U-Boot
Just run:
make
or, to compile faster:
make -j 8
This runs 8 compiler jobs in parallel (for example if you have 4 CPU cores on your workstation).
Using more jobs than cores helps to keep the CPUs and I/Os fully loaded, for
optimum performance.
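Rather than hardcoding the number of jobs, you can derive it from the number of cores. The sketch below applies the "twice the number of cores" rule of thumb used above; it assumes the nproc command from GNU coreutils is available and falls back to one core otherwise.

```shell
# Number of CPU cores (1 if nproc is not available)
cores=$(nproc 2>/dev/null || echo 1)
# Twice as many jobs as cores, as in the "make -j 8" example for 4 cores
njobs=$((cores * 2))
echo "make -j $njobs"
```

This only prints the command line to use. Copy it, or replace the echo line with the make invocation itself.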
You can safely ignore the warnings that you get at the end of the build job. They are meant
to draw the attention of OMAP sub-maintainers in U-Boot, to remind them that they should
update their code that is using deprecated code infrastructure.
At the end, you have MLO and u-boot.img files that we will put on a micro SD card for booting.
• A first partition for the bootloader. It needs to comply with the requirements of the
AM335x SoC so that it can find the bootloader in this partition. It should be a FAT32
partition. We will store the bootloader (MLO and u-boot.img), the kernel image (zImage),
the Device Tree (am335x-boneblack.dtb) and a special U-Boot script for the boot.
• A second partition for the root filesystem. It can use whichever filesystem type you want,
but for our system, we’ll use ext4.
First, let’s identify under what name your SD card is identified in your system: look at the
output of cat /proc/partitions and find your SD card. In general, if you use the internal SD
card reader of a laptop, it will be mmcblk0, while if you use an external USB SD card reader, it
will be sdX (i.e. sdb, sdc, etc.). Be careful: /dev/sda is generally the hard drive of your
machine!
If your SD card is /dev/mmcblk0, then the partitions inside the SD card are named
/dev/mmcblk0p1, /dev/mmcblk0p2, etc. If your SD card is /dev/sdc, then the partitions inside are
named /dev/sdc1, /dev/sdc2, etc.
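The naming rule above can be sketched as a small shell function, which is handy if you script the SD card preparation. The part_name helper is a hypothetical name; it simply adds a "p" separator when the disk name ends with a digit.

```shell
# Print the device node of partition number $2 on whole-disk device $1.
# Disk names ending with a digit (such as mmcblk0) take a "p" separator.
part_name() {
    case "$1" in
        *[0-9]) echo "${1}p${2}" ;;
        *)      echo "${1}${2}" ;;
    esac
}

part_name /dev/mmcblk0 1   # prints /dev/mmcblk0p1
part_name /dev/sdc 2       # prints /dev/sdc2
```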
1. Unmount all partitions of your SD card (they are generally automatically mounted by
Ubuntu)
2. Erase the beginning of the SD card to ensure that the existing partitions are not going to
be mistakenly detected:
sudo dd if=/dev/zero of=/dev/mmcblk0 bs=1M count=16
Use sdc or sdb instead of mmcblk0 if needed.
• Create a first small partition (128 MB), primary, with type e (W95 FAT16) and mark
it bootable
• Create a second partition, also primary, with the rest of the available space, with
type 83 (Linux).
• Exit cfdisk
At the end of the lab, you’ll have your system completely up and running.
Setup
Go to the ~/boot-time-labs/kernel/linux/ directory and install a package that we will need to
compile the Linux kernel:
First, let’s get the latest updates to the remote stable tree (that’s needed if you started from a
ready-made archive of the Linux git repository):
Then, let’s get the list of branches on our stable remote tree:
git branch -a
As we will do our labs with the Linux 5.1 stable branch, the remote branch we are interested in
is remotes/stable/linux-5.1.y.
But first, open the Makefile file just to check the Linux kernel version that you currently have.
Now, let’s create a local branch starting from that remote branch:
This local branch will allow us to keep our modifications to the Linux kernel to support the
LCD4 cape that we’re using.
Open Makefile again and make sure you now have a 5.1.<n> version.
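If you prefer not to open the file in an editor, the version can also be read from the command line. This is a minimal sketch, assuming the usual "VERSION = ...", "PATCHLEVEL = ..." and "SUBLEVEL = ..." lines at the top of the kernel Makefile; kernel_version is a hypothetical helper name.

```shell
# Print VERSION.PATCHLEVEL.SUBLEVEL as found in the kernel Makefile given as $1.
kernel_version() {
    awk '
        $1 == "VERSION"    && $2 == "=" { v = $3 }
        $1 == "PATCHLEVEL" && $2 == "=" { p = $3 }
        $1 == "SUBLEVEL"   && $2 == "=" { s = $3 }
        END { print v "." p "." s }
    ' "$1"
}

# Example, from the kernel source directory:
#   kernel_version Makefile
```

Inside the kernel tree, make kernelversion should report the same information.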
Compiling environment
You need the same PATH and CROSS_COMPILE environment variables as when you compiled U-
Boot, plus the ARCH one that corresponds to the target architecture.
So, add the below line at the end of your ~/.bashrc file:
export ARCH=arm
Now source this file (source ~/.bashrc) in the current terminal, or start a new terminal to get
all needed variables.
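Before starting a long build, it can be worth checking that the cross-compiler is really reachable with these settings. Here is a small sketch; check_toolchain is a hypothetical helper name, and it only tests that the compiler named by CROSS_COMPILE can be found through PATH.

```shell
# Check that ${CROSS_COMPILE}gcc can be found through PATH.
check_toolchain() {
    if [ -z "${CROSS_COMPILE:-}" ]; then
        echo "CROSS_COMPILE is not set"
        return 1
    fi
    cc="${CROSS_COMPILE}gcc"
    if command -v "$cc" >/dev/null 2>&1; then
        echo "found: $cc"
    else
        echo "missing: $cc (check PATH)"
        return 1
    fi
}

# Example:
#   check_toolchain && echo "ARCH=${ARCH:-unset}"
```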
• CONFIG_AM335X_PHY_USB=y
• CONFIG_USB_GPIO_VBUS=y
• CONFIG_USB_GADGET_VBUS_DRAW=500
• CONFIG_USB_CONFIGFS_F_UVC=y
For the webcam
• CONFIG_MEDIA_SUPPORT=y
• CONFIG_MEDIA_USB_SUPPORT=y
• CONFIG_USB_VIDEO_CLASS=y
For your convenience, or if you break your settings in a later lab, you can also use a reference
configuration file found in boot-time-labs/kernel/data.
Bootloader configuration
Back to the serial console for your board, let’s define the default boot sequence, to load the
kernel and DTB from the external SD card:
setenv bootcmd 'fatload mmc 0:1 81000000 zImage; fatload mmc 0:1 82000000 dtb; bootz 81000000 - 82000000'
The last thing to do is to define the kernel command line:
setenv bootargs console=ttyO0,115200n8 root=/dev/mmcblk0p2 rootwait ro
• rootwait waits for the root device to be ready before attempting to mount it. You may
have a kernel panic otherwise.
• ro mounts the root filesystem in read-only mode.
Last but not least, save your changes:
saveenv
Testing time!
First, connect the USB webcam provided by your instructor, and point it to an interesting
direction ;)
Then, reset your board or power it on, and see it work as expected. If you don’t get what you
expected, check your serial console for errors, and if you’re stuck, show your system to your
instructor.
During this lab, we will use techniques to measure boot time using only software solutions.
Initial measurements
Now, take your calculator and fill the below table with the results from your experiments:
Step           Duration   Description
U-Boot SPL                Between U-Boot SPL 2019.01 and U-Boot 2019.01
U-Boot                    Between U-Boot 2019.01 and Starting kernel
Kernel                    Between Starting kernel and Run /sbin/init
Init scripts              Between Run /sbin/init and Starting ffmpeg
Application               Between Starting ffmpeg and First frame decoded
Total
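Instead of a calculator, the durations can be computed from a saved console log. The sketch below assumes a log in which every line starts with a timestamp in seconds, followed by a space and the message; this format and the elapsed_between name are assumptions, so adapt them to what your capture tool actually produces.

```shell
# Print the time elapsed between the first line of log file $1 containing
# string $2 and the first later line containing string $3.
elapsed_between() {
    awk -v a="$2" -v b="$3" '
        !start_seen && index($0, a) { t0 = $1; start_seen = 1; next }
        start_seen && index($0, b)  { printf "%.3f\n", $1 - t0; exit }
    ' "$1"
}

# Example, with a hypothetical boot.log file:
#   elapsed_between boot.log "U-Boot SPL 2019.01" "U-Boot 2019.01"
#   elapsed_between boot.log "Starting kernel" "Run /sbin/init"
```

The strings are matched literally, so "U-Boot 2019.01" does not match the "U-Boot SPL 2019.01" line.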
During this lab, we will use a hardware technique to measure boot time, from cold boot (or
reset) to the instant when the first frame has been decoded.
Arduino setup
Take the Arduino Nano board provided by your instructor. Connect it in the middle of the
breadboard provided too, so that you can connect wires to both sides of the Arduino.
Download the 1.8.9 version of the Arduino IDE from https://www.arduino.cc/ (don’t use the
Arduino package in Ubuntu 18.04, as it has issues connecting to the serial port, even with
root privileges, while the official version works without any problem). Extract the archive in
/usr/local/.
Use the provided USB cable to connect the Arduino to your PC, and start the IDE:
/usr/local/arduino-1.8.9/arduino
Now, configure the IDE for your Arduino:
• In Tools, Board, select Arduino Nano
• In Tools, Processor, select ATmega328P (or ATmega328P (Old Bootloader) if you have a
Nano clone)
• In Tools, Port, select ttyUSB1 (or ttyUSB0 if the serial line for your Bone Black board is
no longer connected).
You are now ready to use your Arduino board:
• Go to Examples, 01. Basics and select Blink. This program blinks the LED on
the Nano.
• Press the Upload button and you should see the sketch work (that’s what the Arduino
community calls its programs).
• You can now unplug the Arduino and plug it back. The same program will be started
automatically. A program only needs to be uploaded once.
Now, using breadboard wires, connect the GND pin of the Arduino to one of the blue rails of the
breadboard, then to the GND pin of the 7-segment module. Please use blue or black wires!
Similarly, connect the 5V pin of the Arduino to the red rail of the breadboard, then to the 5V
pin of the module. Using red or orange wires is recommended too.
Then, you can connect the Arduino D2 pin to the CLK pin of the module, and the D3 pin to the
DIO pin of the module:
Oops, a library is missing. You could have retrieved it through the IDE’s library manager
(Tools, Manage Libraries), but in this case, we absolutely need its latest version. So, go to
https://github.com/Seeed-Studio/Grove_4Digital_Display, download a zip file and extract
this archive into ~/Arduino/libraries/. Rename Grove_4Digital_Display-master to
Grove_4Digital_Display (removing the -master suffix added by GitHub), and you should be ready to
go.
A first possibility would be to watch the 3.3V VDD pins of the Bone Black board and start
counting when they go high when the board is powered on. However, it would be cumbersome
to have to power off the board each time we wish to make a measurement.
A second possibility is to watch the state of the RESET signal. When this pin goes from high
to low, and back to high, it means that the board starts booting. That’s a good time to start
counting, and doing it after each reset is a convenient solution.
Look at the Bone Black System Reference and find which pin on the P8 or P9 headers is used
to expose the SYS_RESET line.
Look at the schematics for the LCD4 cape. Unfortunately, no reference manual was ever
published for this cape. However, the schematics are sufficient to find pins that are not used by the
LCD4 cape.
If you look for Bootlin in the Device Tree Source we provided, you can see in the pin definitions
sections that we selected pin 13 of the P9 header:
If you look at the Expansion Header P9 Pinout table in the board’s Reference Manual, you will
see that MODE7 makes GPIO number 31 available on P9’s pin 13.
Back to the pin for SYS_RESET, there is nothing to configure to get it. It’s the only line connected
to the pin.
We are going to use the good old wire wrapping technique, as shown by the instructor, to hook up to
the pins we want to monitor, with a reliable connection, but without any soldering.
So, take out the LCD4 cape, find which of its header pins are connected to the Bone Black pins
we want to monitor, and then use the wire and tool provided by the instructor to connect to
these pins, before re-attaching the LCD4 cape.
On the other end of the cables, your instructor will also give you small headers that you can
plug into the breadboard holes and do some wire wrapping on them too.
We need to control the state of the pins we watch when they are not driven.
For the SYS_RESET signal, we are lucky that the ATmega328p CPU pins can be configured as
pull-up, so we just need to configure them in software without having to use our own resistors.
For our custom GPIO pin, there is only one choice because we cannot use pull-up. If we used a
pull-up, we would have a 5V voltage level coming from the Arduino, and the BeagleBone
Black is not 5V tolerant, as stated in its manual. Therefore, pull-down is our only option.
As a consequence, as the Arduino only supports internal pull-ups for its pins, we will have to use our own
pull-down resistors.
Then which Arduino pins to connect to? As the Beagle Bone Black board has a 3.3V voltage
level, it’s best to use the Arduino’s Analog pins to measure the voltage driven by the Bone
Black with precision. We will measure a small integer value for 0V, and about 700 (out of a
1024 maximum) for 3.3V.
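The "about 700" figure can be checked with a quick calculation. This sketch assumes a 10-bit analog to digital converter using the Arduino’s 5V supply as its reference, so that a reading is roughly the input voltage times 1024 divided by 5V; adc_reading is a hypothetical helper name.

```shell
# Expected analog reading for an input voltage given in millivolts ($1),
# assuming a 10-bit converter with a 5000 mV reference.
adc_reading() {
    echo $(( $1 * 1024 / 5000 ))
}

adc_reading 3300   # prints 675
adc_reading 0      # prints 0
```

So, a 3.3V level should read somewhere around 675, and any threshold comfortably between the two values (a few hundred) should be enough to tell a low level from a high one.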
So, let’s use Arduino pin A0 for reset, and pin A1 for the custom GPIO, adding a 1 kΩ
pull-down resistor provided by your instructor:
Toolchain optimization
Get the best cross-compiling toolchain for your application and system
The goal of this lab is to find the best toolchain for your application, in terms of performance
and code size. Smaller code can be faster to load, and save time when used in an initramfs
(when the whole filesystem is loaded at once in RAM).
In this lab, we will see how to test an alternative toolchain, measuring:
• Application execution time
• Application and total filesystem size
Application optimization
Optimize the size and startup time of your application
Measuring
We have already measured application startup time in the previous lab.
In our system, we use a generic version of ffmpeg that was built with support for too many
codecs and options that we actually do not need in our very special case.
So, let’s try to find out what the minimum requirements for ffmpeg are.
• A software scaler to resize the input video for our LCD screen
Let’s check ffmpeg’s configure script, and see what its options are:
cd ~/boot-time-labs/rootfs/buildroot-arm/output/build/ffmpeg-3.4.5
./configure --help
We see that configure has five interesting options: --list-encoders, --list-decoders,
--list-filters, --list-outdevs and --list-indevs.
Run configure with each of those and recognize the features that we need to enable.
Following these findings, here’s how we are going to modify Buildroot’s configuration for ffmpeg.
This time, let’s assume that the Thumb2 build from the previous lab has completed. If that’s
the case, finish that lab (measuring and writing down size and performance), and come back
here when you are done:
cd ~/boot-time-labs/rootfs/buildroot/
make menuconfig
In Buildroot’s configuration interface:
• Set Enabled encoders to rawvideo
• Set Enabled decoders to mjpeg
• Empty the Enabled muxers, Enabled demuxers, Enabled parsers, Enabled bitstreams
and Enabled protocols settings.
• Set Enabled filters to scale
• For Enable output devices and Enable input devices, individual device selection is not
possible, so empty these settings: we will configure devices manually in the next field.
• Set Additional parameters for ./configure to
--enable-indev=v4l2 --enable-outdev=fbdev
Now, let’s get Buildroot to recompile ffmpeg, taking our new settings into account:
make ffmpeg-dirclean
make
You can now fill the below table, reusing data from the previous lab:
Total rootfs size /usr/bin/ffmpeg size
Initial configuration
Reduced configuration
Difference (percentage)
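For the last row of the table, here is a small sketch that computes the size reduction as a percentage. The pct_reduction name is hypothetical, and both sizes must be given in the same unit.

```shell
# Percentage reduction from an initial size ($1) to a reduced size ($2).
pct_reduction() {
    awk -v old="$1" -v new="$2" 'BEGIN { printf "%.1f\n", (old - new) * 100 / old }'
}

# Example with made-up numbers: going from 200 to 150 is a 25% reduction
pct_reduction 200 150   # prints 25.0
```

You can feed it with the sizes reported by ls -l or du for the root filesystem archive and for /usr/bin/ffmpeg.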
Do you expect to see differences in execution time, with a reduced configuration? Run the
measures with time again, and compare with what you got during the previous lab.
If the results surprise you, don’t hesitate to show them to your instructor and ask for her/his opinion.
Looking at the ffmpeg log which displays enabled configuration settings, try to find further
configuration switches which can be removed without breaking the player in our particular
system.
Something that can help too is to inspect the whole root filesystem, looking for files that don’t
seem necessary.
The easiest way is to do this on the workstation:
sudo apt install tree
cd /media/$USER/rootfs
tree
The tree command really makes this task easier. For the moment, don’t bother about Busybox
and system files. They will be addressed later. Better focus on files and libraries related to
ffmpeg.
With a build system like Buildroot, it’s easy to add performance analysis and debugging utilities.
Configure Buildroot to add strace to your root filesystem. You will find the corresponding
configuration option in Target packages and then in Debugging, profiling
and benchmark.
Run Buildroot and reflash your device as usual.
With strace’s help, you can already have a pretty good understanding of how your application
spends its time. You can see all the system calls that it makes and knowing the application,
you can guess in which part of the code it is at a given time.
You can also spot unnecessary attempts to open files that do not exist, multiple accesses to
the same file, or more generally things that the program was not supposed to do. All these
correspond to opportunities to fix and optimize your application.
Once the board has booted, run strace on the video player application:
strace -tt -f -o strace.log ffmpeg -f video4linux2 -video_size 544x288 \
-input_format mjpeg -i /dev/video0 -pix_fmt rgb565le -f fbdev /dev/fb0
Also have strace generate a summary:
strace -c -f -o strace-summary.log ffmpeg ...
Take some time to read strace.log (see footnote 7), and see everything that the program is doing. Don’t
hesitate to look up the ioctl codes on the Internet to have an idea about what’s going on between
the player, the camera and the display.
Also have a look at strace-summary.log. You will find the number of errors trying to open files
that do not exist, and where most time is spent, for example. You can also count the number
of memory allocations (using the mmap2 system call).
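The detailed log can also be filtered with standard tools. Here is a minimal sketch that looks for system calls failing with ENOENT (no such file or directory), assuming the usual strace line format ending in "= -1 ENOENT (...)"; the two function names are hypothetical.

```shell
# Count the system calls in strace log $1 that failed with ENOENT.
count_enoent() {
    grep -c ' = -1 ENOENT ' "$1"
}

# List the quoted arguments (usually file paths) of those failed calls,
# most frequent first.
missing_paths() {
    grep ' = -1 ENOENT ' "$1" | grep -o '"[^"]*"' | sort | uniq -c | sort -rn
}

# Example:
#   count_enoent strace.log
#   missing_paths strace.log
```

A path that shows up many times in the second list is a good candidate for investigation.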
the vi editor becomes useful. See https://bootlin.com/doc/command_memento.pdf for a basic command summary.
Otherwise, you can use the more rudimentary more command. You can also copy the files to your PC, using a
USB drive, for example.
8 https://buildroot.org/downloads/manual/manual.html#full-rebuild
Measuring
Remember that the first step in optimization work is measuring elapsed time. We need to know
which parts of the init scripts are the biggest time consumers.
Check and write down the initial size of the root filesystem archive.
To compile and use bootchart on your workstation, you first need to install a few Java packages:
sudo apt install ant openjdk-11-jdk
Note that ant is a Java based build tool like make.
Now, get the bootchart source code for version 0.9 from http://www.bootchart.org/ (see footnote 9), compile
it and use bootchart to generate the boot chart:
tar xf bootchart-0.9.tar.bz2
cd bootchart-0.9
ant
java -jar bootchart.jar ~/boot-time-labs/rootfs/bootlog.tgz
This produces the bootlog.png image which you can visualize to study and optimize your startup
sequence:
xdg-open bootlog.png
xdg-open is a universal way of opening a file with a given MIME type with the associated
application as registered in the system. According to the exact Ubuntu flavor that you are
using, it will run the particular image viewer available in that particular flavor.
The above graph shows several system processes running during the startup process, consuming
CPU time, probably delaying the execution of our application (see the color bars showing when
a task is using the CPU).
9 Don’t try to get the bootchart package supplied by Ubuntu instead. While it has similar functionality, it
looks like a completely unrelated piece of software. To confirm this, it has no dependency whatsoever on Java
packages.
In the general case, there will be services that you want to keep. At least, you could change the
order in which services are started, by changing the alphanumeric order of startup
files (that’s reordering / postponing work).
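To see the order that results from the file names, here is a small sketch. It assumes that startup scripts are the files matching S??* in the init directory and that they are run in the order in which the shell expands that pattern; init_order is a hypothetical helper name.

```shell
# Print the startup scripts of directory $1 in the order given by their names.
init_order() {
    for f in "$1"/S??*; do
        if [ -e "$f" ]; then
            basename "$f"
        fi
    done
}

# Example, on a mounted root filesystem:
#   init_order /media/$USER/rootfs/etc/init.d
```

With such an ordering, renaming S50playvideo to a lower number would move it earlier in the list.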
Back to our case, we want to simplify our system as much as possible. In Buildroot’s
configuration interface, go to the System configuration menu:
• Disable Enable root login with password and Run a getty (login prompt) after boot.
• Disable Purge unwanted locales
Don’t forget to remove bootchartd support as well, along with BR2_FEATURE_SEAMLESS_GZ which we
added earlier. Also disable ifupdown scripts in Networking applications. This removes
the /etc/init.d/S40network script.
Before we update the filesystem, let’s make another experiment: boot your board, interrupt the
video player, and unmount /proc and /sys manually. Then, run the video player command
again, and you’ll see that ffmpeg runs perfectly without these virtual filesystems mounted.
This means we could directly run the video player as the init process! To keep the possibility
to interact with the board through a command line shell, we’re going to run a shell after the
video player.
So, let’s remove /etc/init.d/S50playvideo from the root filesystem overlay and replace it with
the /playvideo script provided in ~/boot-time-labs/rootfs/data/.
Regenerate your root filesystem:
make clean
make
Update and reflash your SD card. Reboot the board, but before booting, stop in U-Boot to
update the init program:
setenv bootargs console=ttyO0,115200n8 root=/dev/mmcblk0p2 rootwait rw init=/playvideo
saveenv
boot
Note that the rw setting is going to be important, as it will make Linux mount the root filesystem
in read/write mode, which allows the access time of each file to be recorded.
If the video player ran fine as expected, you should now be in a shell in the serial console.
Type the sync command to flush the filesystem, remove the SD card and insert it on your PC
again. Go to /media/$USER/rootfs and run the below command:
find . -atime -100 -type f
This lists the regular files that were not accessed during the boot sequence we’ve just executed:
./usr/lib/os-release
./usr/share/ffmpeg/libvpx-720p50_60.ffpreset
./usr/share/ffmpeg/libvpx-360p.ffpreset
./usr/share/ffmpeg/libvpx-720p.ffpreset
./usr/share/ffmpeg/ffprobe.xsd
./usr/share/ffmpeg/libvpx-1080p.ffpreset
./usr/share/ffmpeg/libvpx-1080p50_60.ffpreset
./usr/share/udhcpc/default.script
./bin/busybox
./lib/libatomic.so.1.2.0
./lib/libgcc_s.so.1
./etc/hostname
./etc/profile.d/umask.sh
./etc/init.d/S02klogd
./etc/init.d/S20urandom
./etc/init.d/rcS
./etc/init.d/rcK
./etc/init.d/S01syslogd
./etc/group
./etc/protocols
./etc/inittab
./etc/passwd
./etc/services
./etc/fstab
./etc/shadow
./etc/hosts
./etc/profile
./etc/issue
./etc/shells
How does it work?
-atime -100 finds all the files whose last access time is less than 100 days ago (-atime counts in
24-hour periods), which in practice matches the time when we extracted the archive. Why doesn’t it find the files which were actually accessed during the
boot sequence? That’s because the board doesn’t have a clock set and its date got back to
January 1st, 1970. You can check the date of such a file:
stat etc/init.d/playvideo
File: etc/init.d/playvideo
Size: 300 Blocks: 8 IO Block: 4096 regular file
Device: b302h/45826d Inode: 376 Links: 1
Access: (0755/-rwxr-xr-x) Uid: ( 1000/ mike) Gid: ( 1000/ mike)
Access: 1970-01-01 01:00:02.250000000 +0100
Modify: 2019-05-21 22:29:21.348835786 +0200
Change: 2019-05-21 22:29:21.748834877 +0200
Birth: -
A limitation is that it doesn’t work with symbolic links and directories, so we don’t know whether
a symbolic link or directory was accessed or not. That’s why we’re keeping only regular files
(-type f).
There’s a discrepancy too with /bin/busybox which was for sure accessed, but through a
symbolic link. We will have to remove it from the list.
If we had a board with a correct date, we would have still been able to use this technique, but
this time only looking for files last accessed more than a few minutes ago (-amin +5).
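The whole selection, including the /bin/busybox exception, can be wrapped in a small function. This is only a sketch of the technique described above: unused_files is a hypothetical name, and the second argument is the age limit in days passed to -atime.

```shell
# List the regular files under directory $1 whose last access is less than
# $2 days old, leaving out ./bin/busybox (accessed through symbolic links).
unused_files() {
    ( cd "$1" && find . -type f -atime "-$2" ! -path ./bin/busybox | sort )
}

# Example, on the mounted SD card:
#   unused_files /media/$USER/rootfs 100
```

The output can then be used as the list of files to remove from the root filesystem.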
We can now implement a Buildroot Post build script that will eliminate such files in the target
directory. That’s much easier than tweaking the recipes that generated these files, and this can
be adapted to each case (each case is special, while recipes should be generic).
So, in the Buildroot directory, create a board/beaglecam/post-fakeroot.sh script that
removes all the above files (except /bin/busybox):
#!/bin/sh
TARGET_DIR="$1"
# Stop if the directory is missing, so rm never runs from the wrong place
cd "$TARGET_DIR" || exit 1
rm -rf \
./usr/lib/os-release \
./usr/share/ffmpeg/libvpx-720p50_60.ffpreset \
./usr/share/ffmpeg/libvpx-360p.ffpreset \
...
Don’t forget to make this script executable!
The last thing to do is to configure BR2_ROOTFS_POST_FAKEROOT_SCRIPT to
board/beaglecam/post-fakeroot.sh (see footnote 10).
Rerun Buildroot and check that your root filesystem was simplified as expected:
make
See what’s left in the final archive. Actually, we propose to be more aggressive and directly
remove the entire /etc directory which we shouldn’t need any more. Do the same for /root,
/tmp, /var, /media, /mnt, /opt, /run and the /lib32 link. We’re keeping the /proc and /sys
directories in case we need to mount the corresponding filesystems.
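One possible way to add this to the script is sketched below. prune_rootfs is a hypothetical helper name; the directory list is the one given above, and the function assumes it receives the target directory that Buildroot passes as first argument to the fakeroot script.

```shell
#!/bin/sh
# Sketch: remove whole directories (and the lib32 link) from the target tree.
# prune_rootfs is a hypothetical helper name, not part of Buildroot.
prune_rootfs() {
    # $1: Buildroot target directory
    [ -d "$1" ] || { echo "prune_rootfs: no such directory: $1" >&2; return 1; }
    for d in etc root tmp var media mnt opt run lib32; do
        # No trailing slash: for lib32 this removes the link, not its target
        rm -rf "${1:?}/$d"
    done
}

# In post-fakeroot.sh this could be called as: prune_rootfs "$TARGET_DIR"
```

The /proc and /sys directories are deliberately absent from the list, so they remain available as mount points.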
Why do this so aggressively for files we won’t access? On an ext4 filesystem, files that are
never accessed do little harm, except perhaps marginally in terms of mounting time (if the
filesystem is unnecessarily big). However, if we store the root filesystem in an initramfs
embedded in the kernel binary, every byte counts, as it makes the kernel image bigger and
therefore longer to load.
Update your root filesystem, remove the rw kernel parameter from bootargs in U-Boot (better
to keep the root filesystem mounted read-only as we don’t cleanly shut down the system) and
check that your system still boots fine.
Check and write down the new size of the root filesystem archive.
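Since sizes are written down several times in this lab, it can help to log them in a file. This is a minimal sketch; record_size and the default sizes.log file name are invented for the example.

```shell
#!/bin/sh
# Sketch: append "label size-in-bytes" to a log file, so that sizes can be
# compared between optimization steps.
# record_size and sizes.log are hypothetical names.
record_size() {
    # $1: label, $2: file to measure, $3: log file (default: sizes.log)
    [ -f "$2" ] || { echo "record_size: not a file: $2" >&2; return 1; }
    # Some wc implementations pad the count with spaces; strip them
    size=$(wc -c < "$2" | tr -d ' ')
    echo "$1 $size" >> "${3:-sizes.log}"
}

# Example: record_size after-cleanup output/images/rootfs.tar
```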
While we’re simplifying the root filesystem, it’s time to reduce the configuration of Busybox, to
only contain the features we need in our system.
Before we do this, check the size of the busybox executable in your root filesystem.
Buildroot helps us to configure BusyBox by providing a make busybox-menuconfig command,
but it will be tedious to use because we will have to unselect countless options.
Here’s another way. Go to output/build/busybox-1.29.3/ and run make allnoconfig, followed
by make menuconfig. You’ll see that most options are unselected!
Select only the options below, based on what we have in our /playvideo script:
• In Settings:
– Enable Support files > 2 GB. Without this, BusyBox will fail to compile (at least
with our toolchain)
• In Shells:
– Enable Use internal glob() implementation, even if you don’t select ash. Otherwise,
compiling hush will fail.
– Select the hush shell
– Keep only Support if/then/elif/else/fi and Support for, while and until loops
• In Coreutils:
– Support for the sleep command, with support for fractional arguments.
– Support for the echo command, without additional options.
– Enable test and test as [
– Disable Extend test to 64 bit
10 We could have used Buildroot’s post build scripts (BR2_ROOTFS_POST_BUILD_SCRIPT), but that would have been
too early, as the fakeroot scripts make some customizations on files like /etc/inittab, which we want to remove.
Now get back to the main Buildroot directory and copy this new configuration:
cp output/build/busybox-1.29.3/.config board/beaglecam/busybox.config
Then, run make menuconfig and set BR2_PACKAGE_BUSYBOX_CONFIG to this new file. In System
configuration, also set Init system to None. Otherwise Buildroot will enable BusyBox init
in your configuration.
Run make, and update your SD card. Check the new size of /bin/busybox!
Also write down the new size of the root filesystem tar archive.
Since we now have only two executables (busybox and ffmpeg), let’s explore the possibility of
switching to static executables, hoping to reduce filesystem size by not having to copy entire
shared libraries.
In Buildroot’s configuration interface, find and set BR2_STATIC_LIBS=y.
Run make clean and make.
Run tar tvf output/images/rootfs.tar to find directories which are now empty and can
therefore be removed. Add such directories to your post-fakeroot script and regenerate the
filesystem. This should save a few extra bytes.
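Spotting empty directories by eye in a long listing is error-prone, so here is a sketch that extracts them from the archive. list_empty_dirs is a hypothetical helper; it uses tar tf (names only) rather than tar tvf, and assumes that tar prints directory entries with a trailing slash, as GNU tar does.

```shell
#!/bin/sh
# Sketch: print the directories of a tar archive that contain no entry.
# list_empty_dirs is a hypothetical helper name.
list_empty_dirs() {
    # $1: tar archive, e.g. output/images/rootfs.tar
    tar tf "$1" | awk '
        { names[NR] = $0 }
        END {
            for (i = 1; i <= NR; i++) {
                d = names[i]
                if (d !~ /\/$/) continue      # keep directories only
                empty = 1
                for (j = 1; j <= NR; j++)
                    # Any other entry starting with this path is a child
                    if (j != i && index(names[j], d) == 1) { empty = 0; break }
                if (empty) print d
            }
        }'
}

# Example: list_empty_dirs output/images/rootfs.tar
```

The comparison is quadratic in the number of entries, which is acceptable for a root filesystem this small.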
Once more, write down the new size of the root filesystem tar archive. You should observe a
substantial space reduction. Let’s keep this option!
Testing
Filesystem optimizations
See which filesystem options are best in terms of boot time
During this lab, we will compare 3 ways of accessing the root filesystem:
• Booting from an ext4 filesystem
• Booting from a SquashFS filesystem
• Booting from an initramfs
Initramfs tests
Booting from an initramfs is completely different. The strong advantage here is that the root
filesystem will be extracted from an archive inside the kernel binary. So instead of several reads
from the MMC, we will have just one (though bigger) read: that of the kernel binary. This can
work well with small root filesystems such as ours.
Booting the kernel should be faster too, as we won’t need the MMC and filesystem drivers at
all. So, let’s configure the kernel accordingly.
Now, you should be able to extract the measurements and write them down in the table above.
If your tests run the same way ours did, the initramfs approach should win by a few tens of
milliseconds.
Also measure the size of your zImage file and write it in the table at the top of this chapter, to
compare with your initial kernel.
Let’s choose this solution with an initramfs. There are still many things we can accelerate during
the execution of the bootloader and of the kernel.
Kernel optimizations
Measure kernel boot components and optimize the kernel boot time
Measuring
We are going to use the kernel initcall_debug functionality.
Our default kernel already has the configuration settings that we need:
• CONFIG_PRINTK_TIME=y, to add a timestamp to each kernel message.
• CONFIG_LOG_BUF_SHIFT=16, to have a big enough kernel ring buffer.
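If you are unsure whether your kernel configuration matches, the two settings can be checked in the kernel .config file. A minimal sketch follows; check_initcall_config is a hypothetical helper name and the path to .config depends on where your kernel sources are.

```shell
#!/bin/sh
# Sketch: verify that a kernel .config contains the two settings listed above.
# check_initcall_config is a hypothetical helper name.
check_initcall_config() {
    # $1: path to the kernel .config file
    for opt in CONFIG_PRINTK_TIME=y CONFIG_LOG_BUF_SHIFT=16; do
        # -x: match the whole line, -F: fixed string, -q: no output
        grep -qxF "$opt" "$1" || { echo "missing: $opt"; return 1; }
    done
    echo "ok"
}

# Example: check_initcall_config linux/.config
```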
That’s not sufficient. We also need the output of the dmesg command.
We are going to make a few changes to the root filesystem. To save time later going back to the
initial Buildroot configuration, make a copy of the buildroot/ directory to buildroot-dmesg/:
rsync -aH buildroot/ buildroot-dmesg/
In this new directory, add support for the dmesg command in BusyBox, and add the line below
after the ffmpeg line in the playvideo script:
dmesg > /dev/console
Run Buildroot again, and update your ~/boot-time-labs/rootfs/rootfs directory again.
Compile your kernel again to update the zImage with this root filesystem.
Now, let’s enable initcall_debug in kernel parameters. Go to the U-Boot command line, add the
settings below to the kernel command line (see footnote 11), and boot your system:
setenv bootargs ${bootargs} initcall_debug printk.time=1
boot
Boot the board with the new kernel image. If everything went well, you can now copy and
paste the special dmesg output to a ~/boot-time-labs/kernel/initcall_debug.log file on your
workstation.
In ~/boot-time-labs/kernel (or whichever directory contains the linux/ kernel sources), run the
following command to generate a boot graph:
linux/scripts/bootgraph.pl initcall_debug.log > boot.svg
You can view the boot graph with the inkscape vector graphics editor:
sudo apt install inkscape
inkscape boot.svg
[Figure: boot graph generated by bootgraph.pl. Time axis ticks from 0.23 to 1.84. Labelled
initcalls, in order: tracer_init_tracefs, chr_dev_init, populate_rootfs, sysc_init,
serial8250_init, omap8250_platform_driver_init, tilcdc_drm_init, mtdoops_init,
fixed_mdio_bus_init, cpsw_driver_init, am335x_child_init, i2c_dev_init, uvc_init,
ledtrig_cpu_init, oprofile_init, xfrm_user_init, inet6_init, sit_init, packet_init,
ipsec_pfkey_init, init_dns_resolver, thumbee_init, swp_emulation_init,
__omap2_common_pm_late_init, __sr_class3_init, load_system_certificate_list, clk_debug_init,
deferred_probe_initcall, rtc_hctosys, regulator_init_complete.]
11 Don’t save these settings with saveenv. We will just need them once.
Now review the longest initcalls in detail. Each label is the name of a function in the kernel
sources. Try to find out in which source file each function is defined (footnote 12), and what
each driver corresponds to.
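As a complement to the graph, the longest initcalls can be listed directly from the log. The sketch below assumes the usual "initcall NAME+0x.../0x... returned N after M usecs" lines that initcall_debug produces for built-in code; top_initcalls is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: print the slowest initcalls as "duration-in-usecs function-name".
# top_initcalls is a hypothetical helper name. It expects lines such as:
#   [    0.6] initcall uvc_init+0x0/0x20 returned 0 after 97656 usecs
top_initcalls() {
    # $1: log file, $2: number of entries to show (default: 10)
    sed -n -E 's/.*initcall ([^ +]+)[^ ]* returned -?[0-9]+ after ([0-9]+) usecs.*/\2 \1/p' "$1" |
        sort -rn | head -n "${2:-10}"
}

# Example: top_initcalls initcall_debug.log 15
```

The function names in the output can then be searched for in the kernel sources, for example with grep -rn.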
Then, you can look at the source code and:
• See whether you need the corresponding driver or feature at all. If you don’t, just
disable it.
• Otherwise, try to look for obvious causes which would explain the very long execution time:
delay loops (look for delay), parameters which can reduce probe time but are not used,
etc.
• There could also be features that could be postponed. In our special case, we should
only keep the kernel features that we need to run our video player. In a real-life system,
however, the boot graph could indeed reveal drivers which could be compiled as modules
and loaded later.
Recompile and reboot the kernel, updating the boot graph until there is nothing left that you
can do.
When you are done exploiting data from the boot graphs, you can remove dmesg support from
BusyBox and remove this command too from playvideo. Update your root filesystem and then
kernel so that we get back to the original situation. We no longer need initcall_debug.
Bootloader optimizations
Reduce bootloader execution time
In this lab, we will run the final stage of boot time reduction:
• Improving the efficiency of the bootloader by optimizing its usage
• Recompiling the bootloader with the minimum set of options, and even completely skipping
the second stage of the bootloader.
So, edit the partition table of your new SD card, and create the first partition in the same way
as when you prepared your original SD card. Then, copy the files over.
You can now go ahead and make tests again, and fill the table with your latest results:
Step                  | Duration | Description
U-Boot SPL            |          | Between U-Boot SPL 2019.01 and U-Boot 2019.01
U-Boot                |          | Between U-Boot 2019.01 and Starting kernel
Kernel + Init scripts |          | Between Starting kernel and Starting ffmpeg
Application           |          | Between Starting ffmpeg and First frame decoded
Total                 |          |
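Each duration in the table is the difference between the timestamps of two consecutive markers. The sketch below does the subtraction, assuming you have collected one line per marker in the form "seconds marker-text", in chronological order; step_durations is a hypothetical helper name and the input format is an assumption of this example.

```shell
#!/bin/sh
# Sketch: compute the duration of each boot step from marker timestamps.
# Input on stdin: one "<seconds> <marker text>" line per marker, in order.
# step_durations is a hypothetical helper name.
step_durations() {
    awk '
        # Return the line without its leading timestamp field
        function label(line) { sub(/^[^ ]+ +/, "", line); return line }
        NR == 1 { first = $1 }
        NR > 1  { printf "%.3f  %s -> %s\n", $1 - prev, name, label($0) }
                { prev = $1; name = label($0) }
        END     { if (NR > 1) printf "%.3f  Total\n", prev - first }
    '
}

# Example: step_durations < markers.txt
```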
Going further
There are several things we can do to try to further optimize things:
• As our storage is now faster, it can be interesting to explore the various kernel compression
schemes again. The optimum solution may be a different one.
• Look for a solution to eliminate the delay detecting the USB webcam.
• If you don’t manage to get rid of this delay, at least take advantage of this spare time to
show signs of life on the screen, by implementing a splashscreen. You can even implement
an animation. One thing you can do is use BusyBox’s fbsplash tool, to first show an
image on the framebuffer, and then even show a progress bar (knowing how much time
you have to wait for the camera to be ready).