Linux Kernel Slides
Corrections, suggestions, contributions and translations are welcome!
Send them to [email protected]
- Kernel, drivers and embedded Linux - Development, consulting, training and support - https://bootlin.com 1/437
Linux kernel and driver development training
▶ These slides are the training materials for Bootlin’s Linux kernel
and driver development training course.
▶ If you are interested in following this course with an experienced
Bootlin trainer, we offer:
• Public online sessions, open to individual registration. Dates are
announced on our site, with registration directly online.
• Dedicated online sessions, organized for a team of engineers
from the same company at a date/time chosen by our customer.
• Dedicated on-site sessions, organized for a team of engineers
from the same company: we send a Bootlin trainer on-site to
deliver the training.
About Bootlin
Bootlin introduction
▶ Engineering company
• In business since 2004
• Before 2018: Free Electrons
▶ Team based in France and Italy
▶ Serving customers worldwide
▶ Highly focused and recognized expertise
• Embedded Linux
• Linux kernel
• Embedded Linux build systems
▶ Strong open-source contributor
▶ Activities
• Engineering services
• Training courses
▶ https://bootlin.com
Bootlin engineering services
Bootlin training courses
Bootlin, an open-source contributor
Bootlin on-line resources
Generic course information
Supported hardware
Labs proposed on another platform
You can also run the labs of this course on the BeaglePlay
board.
Shopping list: hardware for this course
▶ USB Serial Cable - 3.3 V - Male ends (for serial labs, two if possible)
Training quiz and certificate
▶ You have been given a quiz to test your knowledge of the topics covered by the
course. It’s not too late to take it if you haven’t done so yet!
▶ At the end of the course, we will give you this quiz again. That time, you
will see the correct answers.
▶ This allows Bootlin to assess the progress you made thanks to the course. It is also
a kind of challenge: look for clues throughout the lectures and labs / demos, as all the
answers are in the course!
▶ Another reason is that we only give training certificates to people who achieve at
least a 50% score in the final quiz and who attended all the sessions.
Participate!
During the lectures...
▶ Don’t hesitate to ask questions. Other people in the audience may have similar
questions too.
▶ Don’t hesitate to share your experience too, for example to compare Linux with
other operating systems you know.
▶ Your point of view is most valuable, because it can be similar to your colleagues’
and different from the trainer’s.
▶ In on-line sessions
• Please always keep your camera on!
• Also make sure your name is properly filled in.
• You can also use the “Raise your hand” button when you wish to ask a question but
don’t want to interrupt.
▶ All this helps the trainer to engage with participants, see when something needs
clarifying and make the session more interactive, enjoyable and useful for everyone.
Collaborate!
Practical lab - Training Setup
Linux Kernel Introduction
Origin
Linux kernel in the system
Linux kernel main roles
System calls
Pseudo filesystems
▶ Linux makes system and kernel information available in user space through
pseudo filesystems, sometimes also called virtual filesystems
▶ Pseudo filesystems allow applications to see directories and files that do not exist
on any real storage: they are created and updated on the fly by the kernel
▶ The two most important pseudo filesystems are
• proc, usually mounted on /proc:
Operating system related information (processes, memory management
parameters...)
• sysfs, usually mounted on /sys:
Representation of the system as a tree of devices connected by buses. Information
gathered by the kernel frameworks managing these devices.
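Because pseudo filesystems look like ordinary files, standard tools are enough to read them. The sketch below extracts one field from a /proc/meminfo-style file. The helper name meminfo_field is hypothetical, and its optional second argument (a file path, defaulting to /proc/meminfo) is only there so the function can be tried on a sample file.

```shell
#!/bin/sh
# Hypothetical helper: print the value of one "Key:   value unit" field.
# $1 = field name (e.g. MemTotal), $2 = file to read (default: /proc/meminfo)
meminfo_field() {
    awk -v key="$1" -F':[ \t]*' '$1 == key { print $2; exit }' "${2:-/proc/meminfo}"
}

# Possible use on a live system (contents are generated by the kernel on
# each read, nothing is stored on disk):
#   meminfo_field MemTotal
#   ls /sys/bus        # buses known to the kernel, shown as sysfs directories
```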
Location of the official kernel sources
Linux versioning scheme
▶ Until 2003, there was a new “stabilized” release branch of Linux every 2 or 3 years
(2.0, 2.2, 2.4). Development branches took 2-3 years to be merged (too slow!).
▶ Since 2003, there has been a new official release of Linux about every 10 weeks:
• Versions 2.6 (Dec. 2003) to 2.6.39 (May 2011)
• Versions 3.0 (Jul. 2011) to 3.19 (Feb. 2015)
• Versions 4.0 (Apr. 2015) to 4.20 (Dec. 2018)
• Versions 5.0 (Mar. 2019) to 5.19 (Jul. 2022)
• Version 6.0 was released in Oct. 2022.
▶ Features are added to the kernel in a progressive way. Since 2003, kernel
developers have managed to do so without having to introduce a massively
incompatible development branch.
Linux development model
▶ Each new release starts with a two-week merge window for new features
▶ It is followed by about 8 release candidates (one week each)
▶ Until adoption of a new official release.
Need to further stabilize the official kernels
▶ Issue: bug and security fixes are only merged into the master branch, so you need to
update to the latest kernel to benefit from them.
▶ Solution: a stable maintainers team goes through all the patches merged into
Torvalds’ tree and backports the relevant ones into their stable branches for at
least a few months.
Location of the stable kernel sources
Need for long term support
▶ Issue: bug and security fixes only released for most recent kernel versions.
▶ Solution: the last release of each year is made an LTS (Long Term Support)
release, and is supposed to be supported (and receive bug and security fixes) for
up to 6 years.
▶ Example at Google: starting from Android O (2017), all new Android devices have
to run such an LTS kernel.
Need for even longer term support
▶ You could also get long term support from a commercial embedded Linux
provider.
• Wind River Linux can be supported for up to 15 years.
• Ubuntu Core can be supported for up to 10 years.
▶ “If you are not using a supported distribution kernel, or a stable / longterm kernel,
you have an insecure kernel” - Greg KH, 2019
Some vulnerabilities are fixed in stable without ever getting a CVE.
▶ The Civil Infrastructure Platform project is an industry / Linux Foundation effort
to support selected LTS versions (currently 4.4, 4.19, 5.10 and 6.1) for much longer
(at least 10 years) on selected architectures. See
https://wiki.linuxfoundation.org/civilinfrastructureplatform/start.
Location of non-official kernel sources
▶ Many chip vendors supply their own kernel sources
• Focusing on hardware support first
• Can have a very large delta with mainline Linux
• Sometimes they break support for other platforms/devices without caring
• Useful in early phases only, when mainline hasn’t caught up yet (many vendors invest
in the mainline kernel at the same time)
• Suitable for PoC, not suitable for products in the long term as usually no updates
are provided to these kernels
• Getting stuck with a deprecated system with broken software that cannot be
updated has a real cost in the end
▶ Many kernel sub-communities maintain their own kernel, usually with newer but
less stable features, only for cutting-edge development
• Architecture communities (ARM, MIPS, PowerPC, etc)
• Device drivers communities (I2C, SPI, USB, PCI, network, etc)
• Other communities (filesystems, memory-management, scheduling, etc)
• Not suitable to be used in products
Linux kernel size and structure
▶ Linux v5.18 sources: close to 80k files, 35M lines, 1.3GiB
▶ But a compressed Linux kernel is only a few megabytes in size.
▶ So, why are these sources so big?
Because they include numerous device drivers, network protocols, architectures,
filesystems... The core is pretty small!
▶ As of kernel version v5.18 (in percentage of total number of lines):
Programming language
No C library
▶ The kernel has to be standalone and can’t use user space code.
▶ Architectural reason: user space is implemented on top of kernel services, not the
opposite.
▶ Technical reason: the kernel is on its own during the boot up phase, before it has
accessed a root filesystem.
▶ Hence, kernel code has to supply its own library implementations (string utilities,
cryptography, decompression...)
▶ So, you can’t use standard C library functions in kernel code (printf(),
memset(), malloc(),...).
▶ Fortunately, the kernel provides similar C functions for your convenience, like
printk(), memset(), kmalloc(), ...
Portability
Linux kernel to user API/ABI stability
▶ Linux kernel to user API: ✔ API stability is guaranteed, source code is portable!
Linux internal API/ABI instability
▶ Linux internal API is not stable: ✘ API stability is not guaranteed, source code
portability is not given
• The source code of a driver is not portable across versions
• In-tree drivers are updated by the developer proposing the API change: works
great for mainline code
• An out-of-tree driver compiled for a given version may no longer compile or work
on a more recent one
• See process/stable-api-nonsense for reasons why
▶ Linux internal ABI is not stable: ✘ no stable ABI over Linux kernel releases,
binaries are not portable
• The module loading utilities will perform this check (figure labels: compiled for
Linux v5.17, in Linux v5.18)
Kernel memory constraints
▶ No memory protection
▶ The kernel doesn’t try to recover from attempts to access illegal memory locations.
It just dumps oops messages on the system console.
▶ Fixed size stack (8 or 4 KB). Unlike in user space, no mechanism was
implemented to make it grow. Don’t use recursion!
▶ Swapping is not implemented for kernel memory either
(except tmpfs which lives completely in the page cache and on swap)
Linux kernel licensing constraints
▶ The Linux kernel is licensed under the GNU General Public License version 2
• This license gives you the right to use, study, modify and share the software freely
▶ However, when the software is redistributed, either modified or unmodified, the
GPL requires that you redistribute the software under the same license, with the
source code
• If modifications are made to the Linux kernel (for example to adapt it to your
hardware), it is a derivative work of the kernel, and therefore must be released under
GPLv2.
▶ The GPL license has been successfully enforced in courts:
https://en.wikipedia.org/wiki/Gpl-violations.org#Notable_victories
▶ However, you’re only required to do so
• At the time the device starts to be distributed
• To your customers, not to the entire world
Proprietary code and the kernel
Abusing the kernel licensing constraints
▶ There are some examples of proprietary drivers
• Nvidia uses a wrapper between their drivers and the kernel
• They claim the drivers could be used with a different OS with another wrapper
• Unclear whether it makes it legal or not
▶ The current trend is to hide the logic in the firmware or in userspace. The
GPL kernel driver is almost empty and either:
• Blindly writes an incoming flow of bytes in the hardware
• Exposes a huge MMIO region to userspace through mmap
Advantages of GPL drivers
▶ You don’t have to write your driver from scratch. You can reuse code from similar
free software drivers.
▶ Your drivers can be freely and easily shipped by others (for example by Linux
distributions or embedded Linux build systems).
▶ Legal certainty, you are sure that a GPL driver is fine from a legal point of view.
Advantages of mainlining your kernel drivers
▶ The community, reviewers and maintainers will review your code before accepting
it, offering you the opportunity to enhance it and better understand the internal
APIs.
▶ Once accepted, you will get cost-free bug and security fixes, support for new
features, and general improvements.
▶ Your work will automatically follow the API changes.
▶ Accessing your code will be much easier for users.
▶ Your code will remain valid no matter the kernel version.
This will certainly reduce your maintenance and support work.
User space device drivers 1/2
▶ The kernel provides some mechanisms to access hardware from userspace:
• USB devices with libusb, https://libusb.info/
• SPI devices with spidev, spi/spidev
• I2C devices with i2cdev, i2c/dev-interface
• GPIOs with libgpiod, https://libgpiod.readthedocs.io
• Memory-mapped devices with UIO, including interrupt handling,
driver-api/uio-howto
▶ These solutions can only be used if:
• There is no need to leverage an existing kernel subsystem such as the networking
stack or filesystems.
• There is no need for the kernel to act as a “multiplexer” for the device: only one
application accesses the device.
▶ Certain classes of devices like printers and scanners do not have any kernel
support: they have always been handled in user space for historical reasons.
▶ Otherwise this is not how the system should be architected. Kernel drivers
should always be preferred!
User space device drivers 2/2
▶ Advantages
• No need for kernel coding skills.
• Drivers can be written in any language, even Perl!
• Drivers can be kept proprietary.
• Driver code can be killed and debugged. Cannot crash the kernel.
• Can use floating-point computation.
• Potentially higher performance, especially for memory-mapped devices, thanks to the
avoidance of system calls.
▶ Drawbacks
• The kernel no longer has access to the device.
• None of the standard applications will be able to use it.
• Cannot use any hardware abstraction or software helpers from the kernel
• Need to adapt applications when changing the hardware.
• Less straightforward to handle interrupts: increased latency.
Practical lab - Kernel Source Code - Exploring
Linux Kernel Usage
Kernel configuration
Kernel configuration and build system
Specifying the target architecture
Choosing a compiler
Specifying ARCH and CROSS_COMPILE
Initial configuration
Difficult to find which kernel configuration will work with your hardware and root
filesystem. Start with one that works!
▶ Desktop or server case:
• Advisable to start with the configuration of your running kernel:
cp /boot/config-`uname -r` .config
▶ Embedded platform case:
• Default configurations are stored in-tree as minimal configuration files (only listing
settings that differ from the defaults) in arch/<arch>/configs/
• make help will list the available configurations for your platform
• To load a default configuration file, just run make foo_defconfig (this will overwrite
your current .config!)
• On ARM 32-bit, there is usually one default configuration per CPU family
• On ARM 64-bit, there is only one big default configuration to customize
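For the desktop or server case, the copy step can be wrapped so that it fails cleanly when the distribution does not ship a config file under /boot. This is only a sketch: both function names are hypothetical, and the optional argument (the kernel source directory) is an assumption made for illustration.

```shell
#!/bin/sh
# Print the path where many desktop/server distributions store the
# configuration of the running kernel (not every distribution ships it).
running_config_path() {
    echo "/boot/config-$(uname -r)"
}

# Hypothetical helper: copy that file to .config in the kernel tree given
# as $1 (default: current directory). Returns 1 and copies nothing if the
# file is missing or unreadable.
seed_config_from_running_kernel() {
    src=$(running_config_path)
    [ -r "$src" ] || { echo "no readable $src" >&2; return 1; }
    cp "$src" "${1:-.}/.config"
}
```

After seeding .config this way, running make oldconfig in the kernel tree lets you answer the questions for options that are new in the sources being built.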
Create your own default configuration
Built-in or module?
▶ The kernel image is a single file, resulting from the linking of all object files that
correspond to features enabled in the configuration
• This is the file that gets loaded in memory by the bootloader
• All built-in features are therefore available as soon as the kernel starts, at a time
where no filesystem exists
▶ Some features (device drivers, filesystems, etc.) can however be compiled as
modules
• These are plugins that can be loaded/unloaded dynamically to add/remove features
to the kernel
• Each module is stored as a separate file in the filesystem, and therefore access
to a filesystem is mandatory to use modules
• This is not possible in the early boot procedure of the kernel, because no filesystem
is available
Kernel option types
There are different types of options, defined in Kconfig files:
▶ bool options, they are either
• true (to include the feature in the kernel) or
• false (to exclude the feature from the kernel)
▶ tristate options, they are either
• true (to include the feature in the kernel image) or
• module (to include the feature as a kernel module) or
• false (to exclude the feature)
▶ int options, to specify integer values
▶ hex options, to specify hexadecimal values
Example: CONFIG_PAGE_OFFSET=0xC0000000
▶ string options, to specify string values
Example: CONFIG_LOCALVERSION=-no-network
Useful to distinguish between two kernels built from different options
Kernel option dependencies
Enabling a network driver requires the network stack to be enabled, therefore
configuration symbols have two ways to express dependencies:
▶ depends on dependency:
    config B
        depends on A
• B is not visible until A is enabled
• Works well for dependency chains
▶ select dependency:
    config A
        select B
• When A is enabled, B is enabled too (and cannot be disabled manually)
• Should preferably not select symbols with depends on dependencies
• Used to declare hardware features or select libraries
Example:
    config SPI_ATH79
        tristate "Atheros AR71XX/AR724X/AR913X SPI controller driver"
        depends on ATH79 || COMPILE_TEST
        select SPI_BITBANG
        help
          This enables support for the SPI controller present on the
          Atheros AR71XX/AR724X/AR913X SoCs.
Kernel configuration details
▶ The configuration is stored in the .config file at the root of kernel sources
• Simple text file, CONFIG_PARAM=value
• Options are grouped by sections and are prefixed with CONFIG_
• “No” value is encoded as # CONFIG_FOO is not set
• Included by the top-level kernel Makefile
• Typically not edited by hand because of the dependencies
Example .config excerpt:
    #
    # CD-ROM/DVD Filesystems
    #
    CONFIG_ISO9660_FS=m
    CONFIG_JOLIET=y
    CONFIG_ZISOFS=y
    CONFIG_UDF_FS=y
    # end of CD-ROM/DVD Filesystems

    #
    # DOS/FAT/EXFAT/NT Filesystems
    #
    CONFIG_FAT_FS=y
    CONFIG_MSDOS_FS=y
    # CONFIG_VFAT_FS is not set
    CONFIG_FAT_DEFAULT_CODEPAGE=437
    # CONFIG_EXFAT_FS is not set
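Since .config is a plain text file, the state of an option can be checked with grep. The sketch below distinguishes a set value, the "is not set" comment form, and a symbol that does not appear at all. The helper name config_state and its output words ("n", "absent") are choices made for this illustration, not kernel conventions.

```shell
#!/bin/sh
# Hypothetical helper: report how a symbol is recorded in a .config-style file.
# $1 = symbol without the CONFIG_ prefix, $2 = file (default: ./.config)
# Prints the value after "=" (y, m, a number, a string...), "n" when the file
# contains "# CONFIG_<symbol> is not set", or "absent" when it is not mentioned.
config_state() {
    file=${2:-.config}
    if value=$(grep "^CONFIG_$1=" "$file"); then
        echo "${value#*=}"
    elif grep -q "^# CONFIG_$1 is not set\$" "$file"; then
        echo "n"
    else
        echo "absent"
    fi
}
```

With the excerpt shown on this slide, config_state ISO9660_FS would report m and config_state VFAT_FS would report n.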
xconfig
make xconfig
▶ A graphical interface to configure the
kernel.
▶ File browser: easy to load
configuration files
▶ Search interface to look for
parameters ([Ctrl] + [f])
▶ Required Debian/Ubuntu packages:
qtbase5-dev on Ubuntu 22.04
menuconfig
make menuconfig
▶ Useful when no graphics are available.
Very efficient interface.
▶ Same interface found in other tools:
BusyBox, Buildroot...
▶ Convenient number shortcuts to jump
directly to search results.
▶ Required Debian/Ubuntu packages:
libncurses-dev
▶ Alternative: make nconfig
(now also has the number shortcuts)
Kernel configuration options
You can switch from one tool to another, they all load/save the same .config file,
and show the same set of options
Compiled as a module:
CONFIG_ISO9660_FS=m
Statically built:
CONFIG_UDF_FS=y
(Figure panels: values in the resulting .config file; parameter values as displayed by xconfig; parameter values as displayed by menuconfig)
make oldconfig
make oldconfig
▶ Useful to upgrade a .config file from an earlier kernel release
▶ Asks for values for new parameters.
▶ ... unlike make menuconfig and make xconfig which silently set default values
for new parameters.
If you edit a .config file by hand, it’s useful to run make oldconfig afterwards, to set
values to new parameters that could have appeared because of dependency changes.
Undoing configuration changes
A frequent problem:
▶ After changing several kernel configuration settings, your kernel no longer works.
▶ If you don’t remember all the changes you made, you can get back to your
previous configuration:
$ cp .config.old .config
▶ All the configuration tools keep this .config.old backup copy.
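Before restoring the backup, it can help to see what actually changed. The sketch below lists the lines that differ between .config.old and .config, ignoring ordering. The helper name config_changes and its "-" / "+" output markers are assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical helper: show configuration lines that differ between two files.
# "-" marks lines found only in the old file, "+" lines found only in the new.
# $1 = old file (default: .config.old), $2 = new file (default: .config)
config_changes() {
    old=${1:-.config.old}
    new=${2:-.config}
    tmp=${TMPDIR:-/tmp}/config_changes.$$
    # Keep only lines mentioning a CONFIG_ symbol, sorted so comm can compare
    grep 'CONFIG_' "$old" | sort > "$tmp.old"
    grep 'CONFIG_' "$new" | sort > "$tmp.new"
    comm -23 "$tmp.old" "$tmp.new" | sed 's/^/-/'
    comm -13 "$tmp.old" "$tmp.new" | sed 's/^/+/'
    rm -f "$tmp.old" "$tmp.new"
}
```

The kernel tree also ships a scripts/diffconfig utility intended for comparing configuration files, which may be more convenient in practice.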
Kernel compilation
Kernel compilation results
Kernel installation: native case
Kernel installation: embedded case
Module installation: native case
Module alias: modules.alias
Module installation: embedded case
Kernel cleanup targets
Cleaning targets:
clean - Remove most generated files but keep the config and
enough build support to build external modules
mrproper - Remove all generated files + config + various backup files
distclean - mrproper + remove editor backup and patch files
▶ If you are in a git tree, remove all files not tracked (and ignored) by git:
git clean -fdx
Kernel building overview
Hardware description
Booting with U-Boot
Kernel command line
▶ In addition to the compile time configuration, the kernel behavior can be adjusted
with no recompilation using the kernel command line
▶ The kernel command line is a string that defines various arguments to the kernel
• It is very important for system configuration
• root= for the root filesystem (covered later)
• console= for the destination of kernel messages
• Example: console=ttyS0 root=/dev/mmcblk0p2 rootwait
• Many more exist. The most important ones are documented in
admin-guide/kernel-parameters in kernel documentation.
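On a running system, the command line the kernel was booted with can be read from /proc/cmdline. The sketch below pulls one name=value argument out of such a string. The helper name cmdline_get is hypothetical, and the parsing is deliberately simplified: quoted values containing spaces are not handled, and only the first match is reported.

```shell
#!/bin/sh
# Hypothetical helper: print the value of a "name=value" argument.
# $1 = argument name, $2 = command line string (default: the contents of
# /proc/cmdline). Returns 1 if the argument is absent or has no "=value".
cmdline_get() {
    name=$1
    line=${2-$(cat /proc/cmdline)}
    for arg in $line; do
        case $arg in
            "$name"=*) echo "${arg#*=}"; return 0 ;;
        esac
    done
    return 1
}
```

With the example from this slide, cmdline_get root "console=ttyS0 root=/dev/mmcblk0p2 rootwait" prints /dev/mmcblk0p2, while asking for rootwait returns a failure status because it carries no value.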
Passing the kernel command line
Kernel log
▶ The kernel keeps its messages in a circular buffer in memory
• The size is configurable using CONFIG_LOG_BUF_SHIFT
▶ When a module is loaded, related information is available in the kernel log.
▶ Kernel log messages are available through the dmesg command (diagnostic
message)
▶ Kernel log messages are also displayed on the console pointed to by the console=
kernel command line argument
• Console messages can be filtered by level using the loglevel parameter:
  loglevel= filters the messages displayed on the console based on priority: only
  messages with a level numerically lower than the given value are displayed
  ignore_loglevel (similar to loglevel=8) will lead to all messages being printed
  quiet lowers the console log level so that only the most severe messages are
  displayed (the threshold is CONFIG_CONSOLE_LOGLEVEL_QUIET, 4 by default)
• Example: console=ttyS0 root=/dev/mmcblk0p2 loglevel=5
▶ It is possible to write to the kernel log from user space, n being the priority
of the message (0 to 7):
echo "<n>Debug info" > /dev/kmsg
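The arithmetic behind these settings can be sketched in a few lines of user-space C. The helper names are hypothetical; the sketch assumes the buffer size is 2 to the power CONFIG_LOG_BUF_SHIFT and that a message reaches the console only when its level number is strictly lower than the console log level.

```c
#include <stdio.h>

/* Size in bytes of the kernel log buffer for a given CONFIG_LOG_BUF_SHIFT */
unsigned long log_buf_size(unsigned int shift)
{
	return 1UL << shift;
}

/*
 * A message is shown on the console only if its level number is strictly
 * lower than the console log level (lower number = more severe).
 */
int shown_on_console(int msg_level, int console_loglevel)
{
	return msg_level < console_loglevel;
}

/*
 * Format a line suitable for writing to /dev/kmsg: "<level>text\n".
 * Returns the value of snprintf(), i.e. the length of the formatted line.
 */
int format_kmsg(char *buf, size_t len, int level, const char *text)
{
	return snprintf(buf, len, "<%d>%s\n", level, text);
}
```

With loglevel=5, a warning (level 4) is displayed while a notice (level 5) is not.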
Practical lab - Kernel compiling and booting
Linux Kernel Usage
Advantages of modules
Module utilities: extracting information
Module utilities: loading
Understanding module loading issues
▶ When loading a module fails, insmod often doesn’t give you enough details!
▶ Details are often available in the kernel log.
▶ Example:
$ sudo insmod ./intr_monitor.ko
insmod: error inserting './intr_monitor.ko': -1 Device or resource busy
$ dmesg
[17549774.552000] Failed to register handler for irq channel 2
Module utilities: removals
Passing parameters to modules
Check module parameter values
How to find/edit the current values for the parameters of a loaded module?
▶ Check /sys/module/<name>/parameters.
▶ There is one file per parameter, containing the parameter value.
▶ Also possible to change parameter values if these files have write permissions
(depends on the module code).
▶ Example:
echo 0 > /sys/module/usb_storage/parameters/delay_use
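The same lookup can be done from a program. Below is a minimal user-space sketch, with hypothetical helper names, that builds the sysfs path of a module parameter and reads its value as a string. It only assumes the /sys/module/<name>/parameters layout described above.

```c
#include <stdio.h>
#include <string.h>

/* Build "/sys/module/<module>/parameters/<param>"; returns 0, or -1 if buf is too small */
int module_param_path(char *buf, size_t len, const char *module, const char *param)
{
	int n = snprintf(buf, len, "/sys/module/%s/parameters/%s", module, param);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Read the first line of a parameter file into val, without the trailing newline */
int read_module_param(const char *path, char *val, size_t len)
{
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;              /* no such parameter, or not readable */
	if (!fgets(val, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	val[strcspn(val, "\n")] = '\0';
	return 0;
}
```

Reading fails when the parameter was declared with perm set to 0, since no file is created in that case.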
Developing kernel modules
Hello module 1/2
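// SPDX-License-Identifier: GPL-2.0
/* hello.c */
#include <linux/init.h>
#include <linux/module.h>
#include <linux/kernel.h>
/* Called at module load time; returning a negative error code makes loading fail */
static int __init hello_init(void)
{
	pr_alert("Good morrow to this fair assembly.\n");
	return 0;
}
static void __exit hello_exit(void)
{
	pr_alert("Alas, poor world, what treasure hast thou lost!\n");
}
module_init(hello_init);
module_exit(hello_exit);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Greeting module");
MODULE_AUTHOR("William Shakespeare");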
Hello module 2/2
Hello module explanations
Symbols exported to modules 1/2
▶ From a kernel module, only a limited number of kernel functions can be called
▶ Functions and variables have to be explicitly exported by the kernel to be visible
to a kernel module
▶ Two macros are used in the kernel to export functions and variables:
• EXPORT_SYMBOL(symbolname), which exports a function or variable to all modules
• EXPORT_SYMBOL_GPL(symbolname), which exports a function or variable only to GPL
modules
• In Linux 5.3, as many symbols are exported with EXPORT_SYMBOL() as with
EXPORT_SYMBOL_GPL()
▶ A normal driver should not need any non-exported function.
Symbols exported to modules 2/2
Module license
▶ Several usages
• Used to restrict the kernel functions that the module can use if it isn’t a GPL
licensed module
Difference between EXPORT_SYMBOL() and EXPORT_SYMBOL_GPL()
• Used by kernel developers to identify issues coming from proprietary drivers, which
they can’t do anything about (“Tainted” kernel notice in kernel crashes and oopses).
• See admin-guide/tainted-kernels for details about tainted flag values.
• Useful for users to check that their system is 100% free (for the kernel, check
/proc/sys/kernel/tainted; run vrms to check installed packages)
▶ Values
• GPL compatible (see include/linux/license.h: GPL, GPL v2,
GPL and additional rights, Dual MIT/GPL, Dual BSD/GPL, Dual MPL/GPL)
• Proprietary
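The value read from /proc/sys/kernel/tainted is a bitmask. As a rough illustration, the sketch below tests a few module-related bits. The helper and constant names are hypothetical; the bit numbers (0 for a proprietary module, 12 for an out-of-tree module, 13 for an unsigned module) are the ones listed in admin-guide/tainted-kernels and should be checked there for your kernel version.

```c
/* Bit numbers as documented in admin-guide/tainted-kernels (names are ours) */
enum {
	TAINT_BIT_PROPRIETARY = 0,   /* proprietary module was loaded */
	TAINT_BIT_OUT_OF_TREE = 12,  /* externally-built module was loaded */
	TAINT_BIT_UNSIGNED    = 13   /* unsigned module was loaded */
};

/* Returns 1 if the given taint bit is set in the bitmask, 0 otherwise */
int taint_bit_set(unsigned long tainted, unsigned int bit)
{
	return (int)((tainted >> bit) & 1UL);
}

/* A value of 0 means that no taint flag is set at all */
int kernel_is_untainted(unsigned long tainted)
{
	return tainted == 0;
}
```

For example, a value of 4097 (4096 + 1) means that both a proprietary module and an out-of-tree module have been loaded.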
Compiling a module
Two solutions
▶ Out of tree, when the code is outside of the kernel source tree, in a different
directory
• Not integrated into the kernel configuration/compilation process
• Needs to be built separately
• The driver cannot be built statically, only as a module
▶ Inside the kernel tree
• Well integrated into the kernel configuration/compilation process
• The driver can be built statically or as a module
Compiling an out-of-tree module 1/2
▶ The Makefile below should be reusable for any single-file out-of-tree Linux
module
▶ The source file is hello.c
▶ Just run make to build the hello.ko file
ifneq ($(KERNELRELEASE),)
obj-m := hello.o
else
KDIR := /path/to/kernel/sources
all:
<tab>$(MAKE) -C $(KDIR) M=$$PWD
endif
Compiling an out-of-tree module 2/2
New driver in kernel sources 1/2
config USB_SERIAL_NAVMAN
tristate "USB Navman GPS device"
depends on USB_SERIAL
help
To compile this driver as a module, choose M
here: the module will be called navman.
New driver in kernel sources 2/2
Hello module with parameters 1/2
// SPDX-License-Identifier: GPL-2.0
/* hello_param.c */
#include <linux/init.h>
#include <linux/module.h>
MODULE_LICENSE("GPL");
static char *whom = "world";
module_param(whom, charp, 0644);
static int howmany = 1;
module_param(howmany, int, 0644);
Hello module with parameters 2/2
static int __init hello_init(void)
{
	int i;
	for (i = 0; i < howmany; i++)
		pr_alert("(%d) Hello, %s\n", i, whom);
	return 0;
}
static void __exit hello_exit(void)
{
	pr_alert("Goodbye, cruel %s\n", whom);
}
module_init(hello_init);
module_exit(hello_exit);
Thanks to Jonathan Corbet for the examples
Source code available on: https://github.com/bootlin/training-materials/blob/master/code/hello-param/hello_param.c
Declaring a module parameter
module_param(
name, /* name of an already defined variable */
type, /* standard types (different from C types) are:
* byte, short, ushort, int, uint, long, ulong
* charp: a character pointer
* bool: a bool, values 0/1, y/n, Y/N.
* invbool: the above, only sense-reversed (N = true). */
perm /* for /sys/module/<module_name>/parameters/<param>,
* 0: no such module parameter value file */
);
/* Example: drivers/block/loop.c */
static int max_loop;
module_param(max_loop, int, 0444);
MODULE_PARM_DESC(max_loop, "Maximum number of loop devices");
Module parameter arrays are also possible with module_param_array().
Practical lab - Writing modules
Describing hardware devices
Discoverable hardware
Describing non-discoverable hardware
Device Tree: from source to blob
dtc example
$ cat foo.dts
/dts-v1/;
/ {
welcome = <0xBADCAFE>;
bootlin {
webinar = "great";
demo = <1>, <2>, <3>;
};
};
dtc example
$ dtc -I dts -O dtb -o foo.dtb foo.dts
$ ls -l foo.dt*
-rw-r--r-- 1 thomas thomas 169 ... foo.dtb
-rw-r--r-- 1 thomas thomas 102 ... foo.dts
$ dtc -I dtb -O dts foo.dtb
/dts-v1/;
/ {
	welcome = <0xbadcafe>;
	bootlin {
		webinar = "great";
		demo = <0x01 0x02 0x03>;
	};
};
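Inside the blob, every 32-bit value is stored in big-endian byte order, and the file starts with a header whose first two fields are a magic number (0xd00dfeed) and the total size of the blob. The following user-space sketch, with hypothetical function names, decodes a cell and sanity-checks such a header. It is an illustration of the layout only; real code would normally rely on libfdt.

```c
#include <stddef.h>
#include <stdint.h>

#define FDT_MAGIC 0xd00dfeedu

/* Convert one big-endian 32-bit cell to a host integer */
uint32_t cell_to_host(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/*
 * Check the magic number at offset 0 and return the total size stored at
 * offset 4, or 0 if the buffer does not look like a Device Tree blob.
 */
uint32_t fdt_total_size(const uint8_t *blob, size_t len)
{
	if (len < 8 || cell_to_host(blob) != FDT_MAGIC)
		return 0;
	return cell_to_host(blob + 4);
}
```

The welcome = <0xBADCAFE> property of foo.dts would therefore appear in foo.dtb as the bytes 0b ad ca fe.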
Where are Device Tree Sources located?
▶ Even though they are OS-agnostic, there is no central and OS-neutral place to host
Device Tree sources and share them between projects
• Often discussed, never done
▶ In practice, the Linux kernel sources can be considered as the canonical location
for Device Tree Source files
• arch/<ARCH>/boot/dts/<vendor>/
• arch/arm/boot/dts (on 32-bit ARM before Linux 6.5)
• ≈ 4500 Device Tree Source files (.dts and .dtsi) in Linux as of 6.0.
▶ Duplicated/synced in various projects
• U-Boot, Barebox, TF-A
Device Tree base syntax
▶ Tree of nodes
▶ Nodes with properties
▶ Node ≈ a device or IP block
▶ Properties ≈ device characteristics
▶ Notion of cells in property values
▶ Notion of phandle to point to other
nodes
▶ dtc only does syntax checking, no
semantic validation
DT overall structure: simplified example
/ {
#address-cells = <1>;
#size-cells = <1>;
model = "TI AM335x BeagleBone Black";
compatible = "ti,am335x-bone-black", "ti,am335x-bone", "ti,am33xx";
cpus { ... };
memory@80000000 { ... };
chosen { ... };
ocp {
intc: interrupt-controller@48200000 { ... };
usb0: usb@47401300 { ... };
l4_per: interconnect@44c00000 {
i2c0: i2c@40012000 { ... };
};
};
};
DT overall structure: simplified example
/ {
cpus {
#address-cells = <1>;
#size-cells = <0>;
cpu0: cpu@0 {
compatible = "arm,cortex-a8";
enable-method = "ti,am3352";
device_type = "cpu";
reg = <0>;
};
};
memory@80000000 {
device_type = "memory";
reg = <0x80000000 0x10000000>; /* 256 MB */
};
chosen {
bootargs = "";
stdout-path = &uart0;
};
ocp { ... };
};
DT overall structure: simplified example
/ {
cpus { ... };
memory@80000000 { ... };
chosen { ... };
ocp {
intc: interrupt-controller@48200000 {
compatible = "ti,am33xx-intc";
interrupt-controller;
#interrupt-cells = <1>;
reg = <0x48200000 0x1000>;
};
usb0: usb@47401300 {
compatible = "ti,musb-am33xx";
reg = <0x1400 0x400>, <0x1000 0x200>;
reg-names = "mc", "control";
interrupts = <18>;
dr_mode = "otg";
dmas = <&cppi41dma 0 0 &cppi41dma 1 0 ...>;
status = "okay";
};
l4_per: interconnect@44c00000 {
i2c0: i2c@40012000 { ... };
};
};
};
DT overall structure: simplified example
/ {
cpus { ... };
memory@80000000 { ... };
chosen { ... };
ocp {
compatible = "simple-pm-bus";
clocks = <&l3_clkctrl AM3_L3_L3_MAIN_CLKCTRL 0>;
clock-names = "fck";
#address-cells = <1>;
#size-cells = <1>;
l4_per: interconnect@44c00000 {
compatible = "ti,am33xx-l4-wkup", "simple-pm-bus";
reg = <0x44c00000 0x800>, <0x44c00800 0x800>,
<0x44c01000 0x400>, <0x44c01400 0x400>;
reg-names = "ap", "la", "ia0", "ia1";
#address-cells = <1>;
#size-cells = <1>;
i2c0: i2c@40012000 { ... };
};
};
};
DT overall structure: simplified example
/ {
cpus { ... };
memory@80000000 { ... };
chosen { ... };
ocp {
intc: interrupt-controller@48200000 { ... };
usb0: usb@47401300 { ... };
l4_per: interconnect@44c00000 {
i2c0: i2c@40012000 {
compatible = "ti,omap4-i2c";
#address-cells = <1>;
#size-cells = <0>;
reg = <0x0 0x1000>;
interrupts = <70>;
status = "okay";
pinctrl-names = "default";
pinctrl-0 = <&i2c0_pins>;
clock-frequency = <400000>;
baseboard_eeprom: eeprom@50 {
compatible = "atmel,24c256";
reg = <0x50>;
};
};
};
};
};
Device Tree inheritance
▶ Device Tree files are not monolithic, they can be split in several files, including
each other.
▶ .dtsi files are included files, while .dts files are final Device Trees
• Only .dts files are accepted as input to dtc
▶ Typically, .dtsi will contain
• definitions of SoC-level information
• definitions common to several boards
▶ The .dts file contains the board-level information
▶ The inclusion works by overlaying the tree of the including file over the tree of
the included file, according to the order of the #include directives.
▶ Allows an including file to override values specified by an included file.
▶ Uses the C pre-processor #include directive
Device Tree inheritance example
Inheritance and labels
Doing:
soc.dtsi
/ {
ocp {
uart0: serial@0 {
compatible = "ti,am3352-uart", "ti,omap3-uart";
reg = <0x0 0x1000>;
status = "disabled";
};
};
};
board.dts
#include "soc.dtsi"
/ {
ocp {
serial@0 {
status = "okay";
};
};
};
Inheritance and labels
The board.dts above is equivalent to the following one, which references the node
through its uart0 label:
board.dts
#include "soc.dtsi"
&uart0 {
	status = "okay";
};
→ this solution is now often preferred
DT inheritance in Bone Black support
Device Tree design principles
▶ Describe hardware (how the hardware is), not configuration (how I choose to
use the hardware)
▶ OS-agnostic
• For a given piece of HW, Device Tree should be the same for U-Boot, FreeBSD or
Linux
• There should be no need to change the Device Tree when updating the OS
▶ Describe integration of hardware components, not the internals of hardware
components
• The details of how a specific device/IP block works are handled by code in device
drivers
• The Device Tree describes how the device/IP block is connected/integrated with the
rest of the system: IRQ lines, DMA channels, clocks, reset lines, etc.
▶ Like all beautiful design principles, these principles are sometimes violated.
The properties
Device tree properties can:
▶ Be generic and apply to most nodes
• Their meaning is usually described in one place: the core DT schema available at
https://github.com/devicetree-org/dt-schema.
• compatible, reg, #address-cells, etc
▶ Cover common consumer-provider relationships
• Their meaning is either described in the dt-schema GitHub repository or under
Documentation/devicetree/bindings.
• clocks, interrupts, regulators, etc
▶ Be subsystem specific
• All devices of a certain class may use them, often starting with the class name
• spi-cpha, i2c-scl-internal-delay-ns, nand-ecc-engine, mac-address, etc
▶ Be vendor/device specific
• To describe uncommon or very specific properties
• Always described in the device’s binding file and prefixed with <vendor>,
• ti,hwmods, xlnx,num-channels, nxp,tx-output-mode, etc
▶ Some of them are deprecated, check the bindings!
The compatible property
▶ Is a list of strings
• From the most specific to the least specific
▶ Describes the specific binding to which the node complies.
▶ It uniquely identifies the programming model of the device.
▶ Practically speaking, it is used by the operating system to find the appropriate
driver for this device.
▶ When describing real hardware, the typical form is vendor,model
▶ Examples:
• compatible = "arm,armv7-timer";
• compatible = "st,stm32mp1-dwmac", "snps,dwmac-4.20a";
• compatible = "regulator-fixed";
• compatible = "gpio-keys";
▶ Special value: simple-bus → bus where all sub-nodes are memory-mapped
devices
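The "most specific to least specific" ordering matters when an operating system looks for a driver. The sketch below is a simplified user-space model of that lookup, not the kernel's actual matching code: match_compatible() is a hypothetical helper that walks the node's compatible list in order and returns the index of the first string that a given driver declares support for.

```c
#include <stddef.h>
#include <string.h>

/*
 * node_compat: compatible strings of a node, most specific first.
 * supported:   compatible strings a driver declares (like an of_device_id table).
 * Returns the index in node_compat of the first supported string, or -1
 * if the driver does not match this node at all.
 */
int match_compatible(const char *const *node_compat, size_t n_node,
		     const char *const *supported, size_t n_supported)
{
	size_t i, j;

	for (i = 0; i < n_node; i++)
		for (j = 0; j < n_supported; j++)
			if (strcmp(node_compat[i], supported[j]) == 0)
				return (int)i;
	return -1;
}
```

With compatible = "st,stm32mp1-dwmac", "snps,dwmac-4.20a", a driver that only knows the generic "snps,dwmac-4.20a" string still matches (index 1), while a driver that knows the SoC-specific string matches at index 0.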
compatible property and Linux kernel drivers
Matching with drivers in Linux: platform driver
drivers/i2c/busses/i2c-omap.c
static const struct of_device_id omap_i2c_of_match[] = {
{
.compatible = "ti,omap4-i2c",
.data = &omap4_pdata,
},
{
.compatible = "ti,omap3-i2c",
.data = &omap3_pdata,
},
[...]
{ },
};
MODULE_DEVICE_TABLE(of, omap_i2c_of_match);
[...]
Matching with drivers in Linux: I2C driver
sound/soc/codecs/cs42l51.c
const struct of_device_id cs42l51_of_match[] = {
{ .compatible = "cirrus,cs42l51", },
{ }
};
MODULE_DEVICE_TABLE(of, cs42l51_of_match);
sound/soc/codecs/cs42l51-i2c.c
static struct i2c_driver cs42l51_i2c_driver = {
.driver = {
.name = "cs42l51",
.of_match_table = cs42l51_of_match,
.pm = &cs42l51_pm_ops,
},
.probe = cs42l51_i2c_probe,
.remove = cs42l51_i2c_remove,
.id_table = cs42l51_i2c_id,
};
reg property
sai4: sai@50027000 {
reg = <0x50027000 0x4>, <0x500273f0 0x10>;
};
reg property
▶ Most important property after compatible
▶ Memory-mapped devices: base physical address and size of the memory-mapped
registers. Can have several entries for multiple register areas.
▶ I2C devices: address of the device on the I2C bus.
&i2c1 {
hdmi-transmitter@39 {
reg = <0x39>;
};
cs42l51: cs42l51@4a {
reg = <0x4a>;
};
};
reg property
▶ SPI devices: chip select number
&qspi {
flash0: mx66l51235l@0 {
reg = <0>;
};
flash1: mx66l51235l@1 {
reg = <1>;
};
};
cells property
▶ Numeric property values are stored in 32-bit containers called cells
▶ The compiler does not keep track of how cells are grouped into entries: for a reg
property with 2 entries of 2 cells each, the OS just receives 4 independent cells
cells property
▶ Other properties are needed to declare the right formatting:
• #address-cells: number of cells used to carry the address
• #size-cells: number of cells used to carry the size of the address range
(0 when an entry is just an address, with no size)
▶ The parent node declares the formatting of its children's reg property
• Platform devices need memory ranges
module@a0000 {
#address-cells = <1>;
#size-cells = <1>;
serial@1000 {
reg = <0x1000 0x10>, <0x2000 0x10>;
};
};
cells property
▶ The parent node declares the formatting of its children's reg property
• SPI devices need chip selects
spi@300000 {
#address-cells = <1>;
#size-cells = <0>;
flash@1 {
reg = <1>;
};
};
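To show how the parent's #address-cells and #size-cells give meaning to a flat list of cells, here is a small user-space sketch that regroups the cells of a reg property into (address, size) entries. The structure and function names are hypothetical, the cells are assumed to be already converted to host byte order, and only 1 or 2 address cells and 0 to 2 size cells are handled.

```c
#include <stddef.h>
#include <stdint.h>

struct reg_entry {
	uint64_t addr;
	uint64_t size;   /* stays 0 when #size-cells is 0 */
};

/*
 * Regroup n_cells cells into entries of (addr_cells + size_cells) cells.
 * Returns the number of entries, or -1 if the cell count does not fit the
 * declared formatting or if out[] is too small.
 */
int decode_reg(const uint32_t *cells, size_t n_cells,
	       unsigned int addr_cells, unsigned int size_cells,
	       struct reg_entry *out, size_t max_out)
{
	size_t stride = (size_t)addr_cells + size_cells;
	size_t n, i;
	unsigned int c;

	if (addr_cells < 1 || addr_cells > 2 || size_cells > 2 ||
	    n_cells % stride != 0)
		return -1;
	n = n_cells / stride;
	if (n > max_out)
		return -1;

	for (i = 0; i < n; i++) {
		out[i].addr = 0;
		out[i].size = 0;
		/* Multi-cell values are stored most significant cell first */
		for (c = 0; c < addr_cells; c++)
			out[i].addr = (out[i].addr << 32) | *cells++;
		for (c = 0; c < size_cells; c++)
			out[i].size = (out[i].size << 32) | *cells++;
	}
	return (int)n;
}
```

With one address cell and one size cell, reg = <0x1000 0x10>, <0x2000 0x10> decodes to two memory ranges; with #size-cells = <0>, reg = <1> decodes to a single chip select number.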
Status property
Resources: interrupts, clocks, DMA, reset lines, ...
▶ Common pattern for resources shared by multiple hardware blocks
intc: interrupt-controller@a0021000 {
	compatible = "arm,cortex-a7-gic";
	#interrupt-cells = <3>;
	interrupt-controller;
	reg = <0xa0021000 0x1000>, <0xa0022000 0x2000>;
};
Generic suffixes
▶ xxx-gpios
• When drivers need access to GPIOs
• May be subsystem-specific or vendor-specific
• Examples: enable-gpios, cts-gpios, rts-gpios
▶ xxx-names
• Sometimes naming items is relevant
• Allows drivers to perform lookups by name rather than ID
• The order of definition of each item still matters
• Examples: gpio-names, clock-names, reset-names
uart0@4000c000 {
dmas = <&edma 26 0>, <&edma 27 0>;
dma-names = "tx", "rx";
...
};
How to validate Device Tree content? 1/2
How to validate Device Tree content? 2/2
Device Tree bindings
Device Tree binding: legacy style
Documentation/devicetree/bindings/i2c/i2c-omap.txt
Required properties :
- compatible : Must be
	"ti,omap2420-i2c" for OMAP2420 SoCs
	"ti,omap2430-i2c" for OMAP2430 SoCs
	"ti,omap3-i2c" for OMAP3 SoCs
	"ti,omap4-i2c" for OMAP4+ SoCs
	"ti,am654-i2c", "ti,omap4-i2c" for AM654 SoCs
	"ti,j721e-i2c", "ti,omap4-i2c" for J721E SoCs
	"ti,am64-i2c", "ti,omap4-i2c" for AM64 SoCs
- ti,hwmods : Must be "i2c<n>", n being the instance number (1-based)
- #address-cells = <1>;
- #size-cells = <0>;
Recommended properties :
- clock-frequency : Desired I2C bus clock frequency in Hz. Otherwise
  the default 100 kHz frequency will be used.
Optional properties:
- Child nodes conforming to i2c bus binding
Note: Current implementation will fetch base address, irq and dma
from omap hwmod data base during device registration.
Future plan is to migrate hwmod data base contents into device tree
blob so that, all the required data will be used from device tree dts
file.
Example:
i2c1: i2c@0 {
	compatible = "ti,omap3-i2c";
	#address-cells = <1>;
	#size-cells = <0>;
	ti,hwmods = "i2c1";
	clock-frequency = <400000>;
};
Device Tree binding: YAML style
Documentation/devicetree/bindings/i2c/ti,omap4-i2c.yaml
Validating Device Trees
Bindings syntax: base structure
Each YAML file defines one DT hierarchical level
(up to two when there are children nodes expected)
▶ %YAML defines the expected language version
▶ %YAML defines the expected language version
▶ $id may not be a real URL, but it is a unique identifier
▶ $schema refers to the base meta-schema this file should be validated against (in
the GitHub repository mentioned previously)
▶ properties: where the definitions start
▶ All possible properties should be listed
• dash-separated lowercase names
• names followed by a colon ’:’ and a new line
▶ Every indentation level is 2 spaces
▶ An empty line between property definitions
# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
%YAML 1.2
---
$id: http://devicetree.org/schemas/<path>/<file-name.yaml>#
$schema: http://devicetree.org/meta-schemas/core.yaml#

title: <Type and name of the device>

maintainers:
  - John Doe <[email protected]>

description: |
  Some multiline text.
  At an additional indentation level.

# This line is a comment
properties:
  prop-a:
    ...

  prop-b:
    ...
Bindings syntax: types
Bindings syntax: child nodes
Bindings syntax: expressing constraints
Besides defining precisely the different properties and their type, the content of the
property values must also be constrained.
▶ All properties can get an additional description parameter, which is only
readable by humans
▶ We try to maximize the constraints to minimize human errors
▶ One new line per constraint
Bindings syntax: numerical constraints
▶ Example of constraints:
• minimum:/maximum: min/max values for a single value
• default: for a default value
• minItems:/maxItems: min/max number of items in an array
properties:
  # The numerical value is bounded
  # This is valid:
  #   frequency-hz = <100000>;
  #   frequency-hz = <0x40000>; /* 262144 Hz */
  # This is not:
  #   frequency-hz = <0>;
  #   frequency-hz = <&gpio 10>;
  frequency-hz:
    minimum: 10000
    maximum: 400000
    default: 100000

  # This is an array with either 1 or 2 members
  # This is valid:
  #   cs-gpios = <&gpioA 1>;
  #   cs-gpios = <&gpioA 1>, <&gpioA 5>;
  # This is not:
  #   cs-gpios = <&gpioA 1>, <&gpioA 5>, <&gpioA 6>;
  #   cs-gpios = <50>;
  cs-gpios:
    minItems: 1
    maxItems: 2
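To make the effect of these keywords tangible, the checks they express can be spelled out by hand. This is only a user-space sketch with hypothetical function names, not how the dt-schema tooling is implemented: it models frequency-hz as a list of cells and cs-gpios by its number of items.

```c
#include <stddef.h>
#include <stdint.h>

/* frequency-hz: exactly one cell, with minimum: 10000 and maximum: 400000 */
int frequency_hz_valid(const uint32_t *cells, size_t n_cells)
{
	return n_cells == 1 && cells[0] >= 10000 && cells[0] <= 400000;
}

/* cs-gpios: minItems: 1 and maxItems: 2 constrain the number of items */
int cs_gpios_items_valid(size_t n_items)
{
	return n_items >= 1 && n_items <= 2;
}
```

The item count alone does not reject cs-gpios = <50>: that entry is invalid because it is not a GPIO specifier, which is a type check rather than a numerical constraint.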
Bindings syntax: lists and dictionaries
▶ Expressing several possible property values (works with numbers and strings):
• Force a single expected value: const
• Allow taking one value from a list: enum
  watch out for the indentation: 2 spaces from the previous keyword and a dash
▶ const/enum can be grouped within an items list, where each items sub-entry must be
observed
▶ We can build abstract conditional lists (e.g. on top of items rather than proper
values like with const/enum):
• XOR using oneOf
• OR using anyOf
• AND using allOf
properties:
  # This is a very common compatible definition
  # The only allowed combinations are (order matters):
  #   compatible = "vendor1,compat", "generic,compat";
  #   compatible = "vendor2,compat", "generic,compat";
  #   compatible = "legacy-compat";
  compatible:
    oneOf:
      - items:
          - enum:
              - vendor1,compat
              - vendor2,compat
          - const: generic,compat
      - items:
          - const: legacy-compat

  # Property name is known by dt-schema, type will be inferred
  # No need for minItems/maxItems, 2 will be implied from
  # the main items list!
  clocks:
    items:
      - description: Interconnect
      - description: External bus

  # This is valid: strength = <0>, <5>;
  # This is invalid: strength = <0>;
  #                  strength = <0>, <8>;
  strength:
    $ref: /schemas/types.yaml#/definitions/uint32-array
    minItems: 2
    maxItems: 2
    items:
      maximum: 5
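The oneOf/items/enum/const combination used for compatible can be read as a small decision procedure. The following user-space sketch, with hypothetical function names, hand-codes the two alternatives of that example and accepts a list only when exactly one alternative matches, which is the XOR meaning of oneOf. It also checks the strength array. It is an illustration of the semantics, not of the dt-schema implementation.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* enum: the string must be one of the listed values */
static int in_enum(const char *s, const char *const *values, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (strcmp(s, values[i]) == 0)
			return 1;
	return 0;
}

/* oneOf: valid only if exactly one of the two "items" lists matches */
int compatible_valid(const char *const *compat, size_t n)
{
	static const char *const vendors[] = { "vendor1,compat", "vendor2,compat" };
	int matches = 0;

	/* First alternative: <enum value>, then const generic,compat */
	if (n == 2 && in_enum(compat[0], vendors, 2) &&
	    strcmp(compat[1], "generic,compat") == 0)
		matches++;
	/* Second alternative: const legacy-compat alone */
	if (n == 1 && strcmp(compat[0], "legacy-compat") == 0)
		matches++;
	return matches == 1;
}

/* strength: exactly 2 items (minItems/maxItems), each with maximum: 5 */
int strength_valid(const uint32_t *values, size_t n)
{
	return n == 2 && values[0] <= 5 && values[1] <= 5;
}
```

Because each items list is positional, "generic,compat" followed by "vendor1,compat" is rejected even though both strings are individually allowed.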
Bindings syntax: referencing other bindings
Bindings syntax: altering on presence of properties
▶ Sometimes more dynamic descriptions are needed
• Dependencies between properties
  A property may be needed if there is another property
  If both or none shall be present, the dependency should be expressed twice (in
  both directions)
• Changing constraints based on a property
  Can be expressed using if/then/else statements under the top-level allOf
  Typical case: a compatible implies tweaking a constraint
properties:
  compatible:
    enum:
      - compat1
      - compat2

  prop-a: true

  prop-b: true

  prop-c: true

dependencies:
  prop-a: [ 'prop-b' ]
  prop-b: [ 'prop-a' ]

allOf:
  - if:
      properties:
        compatible:
          contains:
            const: compat1
    then:
      properties:
        prop-c: false
Bindings syntax: enforcing correct properties only
▶ YAML files list properties and add constraints to them
• It is still possible to add undefined properties
• It is still possible to forget defining a mandatory property
▶ We need further constraints to spot typos and unexpected properties
• required forces the presence of a property
• additionalProperties prevents the use of any property not defined in this file
• unevaluatedProperties prevents the use of any property that is neither defined in
this file nor referenced (through allOf or $ref)
allOf:
  - $ref: generic-file.yaml
properties:
  prop-a: true
  prop-b: true
  child-node:
    type: object
    properties:
      prop-c: true
      prop-d: true
    required:
      - prop-c
    # No additional property than the ones above
    # will be allowed inside child-node
    additionalProperties: false
required:
  - prop-a
# Only properties defined above or coming from
# generic-file.yaml will be allowed
unevaluatedProperties: false
Bindings syntax: validating your own bindings
▶ It is highly recommended to validate your bindings before testing your DTS
• Add examples at the end of your file!
• Examples are indented with 4 spaces
properties:
  prop-a: true
  prop-b: true
  child-node:
    type: object
    additionalProperties: false
required:
  - prop-a
unevaluatedProperties: false
examples:
  - |
    node@1000 {
        prop-a;
    };
References
▶ Device Tree 101 webinar, Thomas Petazzoni (2021):
Slides: https://bootlin.com/blog/device-tree-101-webinar-slides-and-videos/
Video: https://youtu.be/a9CZ1Uk3OYQ
▶ Kernel documentation
• driver-api/driver-model/
• devicetree/
• filesystems/sysfs
▶ https://devicetree.org
▶ The kernel source code
• Full of examples of other drivers!
• Reference DT binding implementation:
Documentation/devicetree/bindings/example-schema.yaml
Practical lab - Describing hardware devices
Introduction to pin muxing
What is pin muxing?
▶ Modern SoCs (System on Chip) include more and more hardware blocks, many of
which need to interface with the outside world using pins.
▶ However, the physical size of the chips remains small, and therefore the number of
available pins is limited.
▶ For this reason, not all of the internal hardware block features can be exposed on
the pins simultaneously.
▶ The pins are multiplexed: they expose either the functionality of hardware block
A or the functionality of hardware block B.
▶ This multiplexing is usually software configurable.
Pin muxing diagram
Pin muxing in the Linux kernel
pinctrl subsystem diagram
Device Tree properties for consumer devices
The devices that require certain pins to be muxed use the pinctrl-<x> and
pinctrl-names Device Tree properties.
▶ The pinctrl-0, pinctrl-1, pinctrl-<x> properties link to a pin configuration
for a given state of the device.
▶ The pinctrl-names property associates a name to each state. The name
default is special: this state is selected automatically when the device is bound
to its driver, without the driver having to make an explicit pinctrl function call.
▶ See Documentation/devicetree/bindings/pinctrl/pinctrl-bindings.txt for
details.
Device Tree properties for consumer devices - Examples
Most common case (arch/arm/boot/dts/marvell/kirkwood.dtsi):
i2c0: i2c@11000 {
    ...
    pinctrl-0 = <&pmx_twsi0>;
    pinctrl-names = "default";
    ...
};
Case with multiple pin states (arch/arm/boot/dts/microchip/sama5d4.dtsi):
i2c0: i2c@f8014000 {
    ...
    pinctrl-names = "default", "gpio";
    pinctrl-0 = <&pinctrl_i2c0>;
    pinctrl-1 = <&pinctrl_i2c0_gpio>;
    ...
};
Defining pinctrl configurations
▶ The different pinctrl configurations must be defined as child nodes of the main
pinctrl device (which controls the muxing of pins).
▶ The configurations may be defined at:
• the SoC level (.dtsi file), for pin configurations that are often shared between
multiple boards
• the board level (.dts file), for configurations that are board specific.
▶ The pinctrl-<x> property of the consumer device points to the pin configuration
it needs through a DT phandle.
▶ The description of the configurations is specific to each pinctrl driver. See
Documentation/devicetree/bindings/pinctrl for the pinctrl bindings.
Example on OMAP/AM33xx
▶ On OMAP/AM33xx, the pinctrl-single driver is used. It is common between multiple
SoCs and simply allows configuring pins by writing a value to a register.
• In each pin configuration, a
/* Excerpt from am335x-bone-common.dtsi */
&am33xx_pinmux {
    ...
    i2c2_pins: pinmux_i2c2_pins {
        pinctrl-single,pins = <
            AM33XX_PADCONF(AM335X_PIN_UART1_CTSN, PIN_INPUT_PULLUP, MU
            /* uart1_ctsn.i2c2_sda */
            AM33XX_PADCONF(AM335X_PIN_UART1_RTSN, PIN_INPUT_PULLUP, MU
            /* uart1_rtsn.i2c2_scl */
        >;
    };
};
Example on the Allwinner A20 SoC
Illustration: live pin muxing configuration
Viewing pin assignments
on the PCB
Linux device and driver model
Linux device and driver model
Introduction
The need for a device model?
▶ The Linux kernel runs on a wide range of architectures and hardware platforms,
and therefore needs to maximize the reusability of code between platforms.
▶ For example, we want the same USB device driver to be usable on an x86 PC, or
an ARM platform, even though the USB controllers used on these platforms are
different.
▶ This requires a clean organization of the code, with the device drivers separated
from the controller drivers, the hardware description separated from the drivers
themselves, etc.
▶ This is what the Linux kernel Device Model allows, in addition to other
advantages covered in this section.
Kernel and device drivers
Device model data structures
Bus drivers
sysfs
▶ The bus, device, drivers, etc. structures are internal to the kernel
▶ The sysfs virtual filesystem offers a mechanism to export such information to
user space
▶ Used for example by udev to provide automatic module loading, firmware loading,
mounting of external media, etc.
▶ sysfs is usually mounted in /sys
• /sys/bus/ contains the list of buses
• /sys/devices/ contains the list of devices
• /sys/class enumerates devices by the framework they are registered to (net,
input, block...), whatever bus they are connected to. Very useful!
Linux device and driver model
Example: USB bus 1/3
Example: USB bus 2/3
Example: USB bus 3/3
Example of device driver
Device identifiers
▶ Defines the set of devices that this driver can manage, so that the USB core
knows for which devices this driver should be used
▶ The MODULE_DEVICE_TABLE() macro allows depmod (run by
make modules_install) to extract the relationship between device identifiers and
drivers, so that drivers can be loaded automatically by udev. See
/lib/modules/$(uname -r)/modules.{alias,usbmap}
Instantiation of usb_driver
▶ struct usb_driver is a structure defined by the USB core. Each USB device
driver must instantiate it, and register itself to the USB core using this structure
▶ This structure inherits from struct device_driver, which is defined by the
device model.
static struct usb_driver rtl8150_driver = {
.name = "rtl8150",
.probe = rtl8150_probe,
.disconnect = rtl8150_disconnect,
.id_table = rtl8150_table,
.suspend = rtl8150_suspend,
.resume = rtl8150_resume
};
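The "inherits from" relationship mentioned above is usually implemented in C by embedding the generic structure inside the bus-specific one and converting between the two with pointer arithmetic. The block below is a minimal user-space sketch of that idea, not kernel code: model_container_of, struct model_device_driver, struct model_bus_driver and to_model_bus_driver() are hypothetical names chosen for illustration, and the macro omits the type checking done by the kernel's container_of().

```c
#include <stddef.h>

/* Same idea as the kernel's container_of(), without its type checks */
#define model_container_of(ptr, type, member) \
        ((type *)((char *)(ptr) - offsetof(type, member)))

/* Stand-in for the generic structure owned by the device model */
struct model_device_driver {
        const char *name;
};

/* Stand-in for a bus-specific driver structure (hypothetical layout) */
struct model_bus_driver {
        int (*probe)(void);
        struct model_device_driver driver;   /* embedded "base class" */
};

/* Recover the bus-specific structure from a pointer to its embedded base */
struct model_bus_driver *to_model_bus_driver(struct model_device_driver *drv)
{
        return model_container_of(drv, struct model_bus_driver, driver);
}
```

With this layout, generic code can keep working on the embedded base structure while bus-specific code recovers its own structure whenever it needs the extra fields.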
Driver registration and unregistration
▶ When the driver is loaded / unloaded, it must register / unregister itself to / from the
USB core
▶ Done using usb_register() and usb_deregister(), provided by the USB core.
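module_init(usb_rtl8150_init);
module_exit(usb_rtl8150_exit);
▶ When the init and exit functions only register and unregister the driver, this
boilerplate can be replaced by a single call to the module_usb_driver() macro:
module_usb_driver(rtl8150_driver);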
At Initialization
▶ The USB adapter driver that corresponds to the USB controller of the system
registers itself to the USB core
▶ The rtl8150 USB device driver registers itself to the USB core
▶ The USB core now knows the association between the vendor/product IDs of
rtl8150 and the struct usb_driver structure of this driver
When a device is detected
Probe method
Example: probe() and disconnect() methods
static int rtl8150_probe(struct usb_interface *intf,
                         const struct usb_device_id *id)
{
        rtl8150_t *dev;
        struct net_device *netdev;

        netdev = alloc_etherdev(sizeof(rtl8150_t));
        [...]
        dev = netdev_priv(netdev);
        tasklet_init(&dev->tl, rx_fixup, (unsigned long)dev);
        spin_lock_init(&dev->rx_pool_lock);
        [...]
        netdev->netdev_ops = &rtl8150_netdev_ops;
        alloc_all_urbs(dev);
        [...]
        usb_set_intfdata(intf, dev);
        SET_NETDEV_DEV(netdev, &intf->dev);
        register_netdev(netdev);
        return 0;
}

static void rtl8150_disconnect(struct usb_interface *intf)
{
        rtl8150_t *dev = usb_get_intfdata(intf);

        usb_set_intfdata(intf, NULL);
        if (dev) {
                set_bit(RTL8150_UNPLUG, &dev->flags);
                tasklet_kill(&dev->tl);
                unregister_netdev(dev->netdev);
                unlink_all_urbs(dev);
                free_all_urbs(dev);
                free_skb_pool(dev);
                if (dev->rx_skb)
                        dev_kfree_skb(dev->rx_skb);
                kfree(dev->intr_buff);
                free_netdev(dev->netdev);
        }
}
Source: drivers/net/usb/rtl8150.c
The model is recursive
Linux device and driver model
Platform drivers
Platform devices
▶ Amongst the non-discoverable devices, a huge family are the devices that are
directly part of a system-on-chip: UART controllers, Ethernet controllers, SPI or
I2C controllers, graphic or audio devices, etc.
▶ In the Linux kernel, a special bus, called the platform bus has been created to
handle such devices.
▶ It supports platform drivers that handle platform devices.
▶ It works like any other bus (USB, PCI), except that devices are enumerated
statically instead of being discovered dynamically.
Implementation of a platform driver (1)
Implementation of a platform driver (2)
module_init(imx_serial_init);
module_exit(imx_serial_cleanup);
Platform device instantiation
Using additional hardware resources
Using resources
Driver data
▶ In addition to the per-device resources and information, drivers may require
driver-specific information to behave slightly differently when different flavors of
an IP block are driven by the same driver.
▶ A const void *data pointer can be used to store per-compatible specificities:
static const struct of_device_id marvell_nfc_of_ids[] = {
        {
                .compatible = "marvell,armada-8k-nand-controller",
                .data = &marvell_armada_8k_nfc_caps,
        },
        { /* sentinel */ },
};
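To show how such a .data pointer is typically consumed, here is a small user-space model of the lookup. It is only a sketch: struct model_of_id, struct foo_caps, the "vendor,foo-*" compatible strings and model_get_match_data() are all hypothetical stand-ins, not kernel definitions. In a real driver the lookup is done by kernel helpers such as of_device_get_match_data() rather than by hand.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical per-variant capabilities, for illustration only */
struct foo_caps {
        unsigned int max_cs;
        int needs_quirk;
};

/* Simplified stand-in for struct of_device_id */
struct model_of_id {
        const char *compatible;
        const void *data;
};

static const struct foo_caps foo_v1_caps = { .max_cs = 2, .needs_quirk = 1 };
static const struct foo_caps foo_v2_caps = { .max_cs = 4, .needs_quirk = 0 };

static const struct model_of_id foo_of_ids[] = {
        { .compatible = "vendor,foo-v1", .data = &foo_v1_caps },
        { .compatible = "vendor,foo-v2", .data = &foo_v2_caps },
        { NULL, NULL },         /* sentinel: marks the end of the table */
};

/* Return the data pointer of the entry matching "compatible", or NULL */
const void *model_get_match_data(const struct model_of_id *ids,
                                 const char *compatible)
{
        for (; ids->compatible; ids++)
                if (strcmp(ids->compatible, compatible) == 0)
                        return ids->data;
        return NULL;
}
```

The probe code would then cast the returned pointer back to the capabilities structure and branch on its fields instead of comparing compatible strings throughout the driver.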
What is I2C?
▶ A very commonly used low-speed bus to connect on-board and external devices to
the processor.
▶ Uses only two wires: SDA for the data, SCL for the clock.
▶ It is a master/slave bus: only the master can initiate transactions, and slaves can
only reply to transactions initiated by masters.
▶ In a Linux system, the I2C controller embedded in the processor is typically the
master, controlling the bus.
▶ Each slave device is identified by an I2C address (you can’t have 2 devices with
the same address on the same bus). Each transaction initiated by the master
contains this address, which allows the relevant slave to recognize that it should
reply to this particular transaction.
An I2C bus example
The I2C bus driver
▶ Like all bus subsystems, the I2C bus driver is responsible for:
• Providing an API to implement I2C controller drivers
• Providing an API to implement I2C device drivers, in kernel space
• Providing an API to implement I2C device drivers, in user space
▶ The core of the I2C bus driver is located in drivers/i2c/.
▶ The I2C controller drivers are located in drivers/i2c/busses/.
▶ The I2C device drivers are located throughout drivers/, depending on the
framework used to expose the devices (e.g. drivers/input/ for input devices).
Registering an I2C device driver
▶ Like all bus subsystems, the I2C subsystem defines a struct i2c_driver that
inherits from struct device_driver, and which must be instantiated and
registered by each I2C device driver.
• As usual, this structure points to the ->probe() and ->remove() functions.
• It also contains a legacy id_table, used for non-DT based probing of I2C devices.
▶ The i2c_add_driver() and i2c_del_driver() functions are used to
register/unregister the driver.
▶ If the driver doesn’t do anything else in its init()/exit() functions, it is advised
to use the module_i2c_driver() macro instead.
Registering an I2C device driver: example
static const struct i2c_device_id adxl345_i2c_id[] = {
{ "adxl345", ADXL345 },
{ "adxl375", ADXL375 },
{ }
};
MODULE_DEVICE_TABLE(i2c, adxl345_i2c_id);
MODULE_DEVICE_TABLE(of, adxl345_of_match);
module_i2c_driver(adxl345_i2c_driver);
From drivers/iio/accel/adxl345_i2c.c
Registering an I2C device: non-DT
Registering an I2C device, non-DT example
...
i2c_register_board_info(0, em7210_i2c_devices,
ARRAY_SIZE(em7210_i2c_devices));
}
From arch/arm/mach-iop32x/em7210.c
Registering an I2C device, in the DT
▶ In the Device Tree, the I2C controller device is typically defined in the .dtsi file
that describes the processor.
• Normally defined with status = "disabled".
▶ At the board/platform level:
• the I2C controller device is enabled (status = "okay")
• the I2C bus frequency is defined, using the clock-frequency property.
• the I2C devices on the bus are described as children of the I2C controller node,
where the reg property gives the I2C slave address on the bus.
▶ See the binding for the corresponding driver for a specification of the expected DT
properties. Example: Documentation/devicetree/bindings/i2c/i2c-omap.txt
Registering an I2C device, DT example (1/2)
From arch/arm/boot/dts/allwinner/sun7i-a20.dtsi
Registering an I2C device, DT example (2/2)
axp209: pmic@34 {
compatible = "x-powers,axp209";
reg = <0x34>;
interrupt-parent = <&nmi_intc>;
interrupts = <0 IRQ_TYPE_LEVEL_LOW>;
interrupt-controller;
#interrupt-cells = <1>;
};
};
From arch/arm/boot/dts/allwinner/sun7i-a20-olinuxino-micro.dts
probe() and remove()
▶ The ->probe() function is responsible for initializing the device and registering it
in the appropriate kernel framework. It receives as argument:
• A struct i2c_client pointer, which represents the I2C device itself. This
structure inherits from struct device.
• On older kernels (< v6.4), ->probe() took a second (unused) argument. Removing
it required a transitional probe function, ->probe_new(), during some kernel
releases.
▶ The ->remove() function is responsible for unregistering the device from the
kernel framework and shutting it down. It receives as argument:
• The same struct i2c_client pointer that was passed to ->probe().
Probe example
From drivers/iio/accel/da311.c
Remove example
From drivers/iio/accel/da311.c
Communicating with the I2C device: raw API
The most basic API to communicate with the I2C device provides functions to either
send or receive data:
▶ Send count bytes from buf to the I2C device with:
int i2c_master_send(const struct i2c_client *client, const char *buf, int count);
▶ Receive count bytes from the I2C device and save them in buf with:
int i2c_master_recv(const struct i2c_client *client, char *buf, int count);
Both functions return a negative error number in case of failure, otherwise the number
of transmitted bytes.
Communicating with the I2C device: message transfer
The message transfer API makes it possible to describe transfers that consist of
several messages, with each message being a transaction in one direction:
int i2c_transfer(struct i2c_adapter *adap, struct i2c_msg *msgs, int num);
I2C: message transfer example
msg[1].addr = ts->client->addr;
msg[1].flags = I2C_M_RD;
msg[1].len = ts->read_buf_len;
msg[1].buf = buf;
From drivers/input/touchscreen/st1232.c
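The excerpt above shows only the read half of a transfer. As a hedged sketch of the common "write a register address, then read the answer" pattern, the helper below fills both messages. It relies on struct i2c_msg and I2C_M_RD from the user-space <linux/i2c.h> header so that it can be compiled outside the kernel; fill_write_then_read() and its parameters are hypothetical names, not part of the st1232 driver.

```c
#include <linux/i2c.h>   /* struct i2c_msg, I2C_M_RD (uapi header) */

/*
 * Fill a two-message transfer: write "wlen" bytes (typically a register
 * address), then read "rlen" bytes back from the same device.
 */
void fill_write_then_read(struct i2c_msg msg[2], __u16 addr,
                          __u8 *wbuf, __u16 wlen,
                          __u8 *rbuf, __u16 rlen)
{
        msg[0].addr = addr;
        msg[0].flags = 0;               /* no I2C_M_RD: this is a write */
        msg[0].len = wlen;
        msg[0].buf = wbuf;

        msg[1].addr = addr;
        msg[1].flags = I2C_M_RD;        /* second message is a read */
        msg[1].len = rlen;
        msg[1].buf = rbuf;
}
```

In a driver, an array filled this way would be handed to i2c_transfer() with num set to 2. On success that function returns the number of messages transferred, so a caller would normally compare the result against 2.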
SMBus calls
▶ SMBus is a subset of the I2C protocol.
▶ It defines a standard set of transactions, such as reading/writing from a
register-like interface.
▶ Linux provides SMBus functions that should preferably be used instead of the raw
API with devices supporting SMBus.
▶ Such a driver will be usable with both SMBus and I2C adapters
• SMBus adapters cannot send raw I2C commands
• I2C adapters will receive an SMBus-like command crafted by the core
▶ Example: the i2c_smbus_read_byte_data() function reads one byte of
data from a device “register”.
• It does the following operations:
S Addr Wr [A] Comm [A] Sr Addr Rd [A] [Data] NA P
• Which means it first writes a one byte data command (Comm, which is the
“register” address), and then reads back one byte of data ([Data]).
▶ See i2c/smbus-protocol for details.
List of SMBus functions
▶ Write a command byte, and read or write a block of data (max 32 bytes)
• s32 i2c_smbus_read_block_data(const struct i2c_client *client, u8 command, u8 *values);
• s32 i2c_smbus_write_block_data(const struct i2c_client *client, u8 command, u8 length, const u8 *values);
▶ Write a command byte, and read or write a block of data without a length byte on the wire (I2C block transfers; the Linux helpers still cap the length at 32 bytes)
• s32 i2c_smbus_read_i2c_block_data(const struct i2c_client *client, u8 command, u8 length, u8 *values);
• s32 i2c_smbus_write_i2c_block_data(const struct i2c_client *client, u8 command, u8 length, const u8 *values);
I2C functionality
References
Practical lab - Communicate with the Nunchuk
Kernel frameworks for device drivers
Kernel and Device Drivers
Kernel frameworks for device drivers
Types of devices
Under Linux, there are essentially four types of devices:
▶ Network devices. They are represented as network interfaces, visible in user
space using ip a
▶ Block devices. They are used to provide user space applications access to raw
storage devices (hard disks, USB keys). They are visible to the applications as
device files in /dev.
▶ Character devices. They are used to provide user space applications access to all
other types of devices (input, sound, graphics, serial, etc.). They are also visible
to the applications as device files in /dev.
▶ Sysfs devices. They don’t have any of the above user space interfaces, only a
representation in sysfs. “Internal” device drivers fall under this (e.g. pinctrl), but
also some user-space accessible devices, e.g. gpio (deprecated) or IIO (Industrial
I/O).
→ Most devices are character devices, so we will study these in more detail.
Major and minor numbers
▶ Within the kernel, all block and character devices are identified using a major and
a minor number.
▶ The major number typically indicates the family of the device.
▶ The minor number allows drivers to distinguish the various devices they manage.
▶ Some major numbers are statically allocated, and identical across all Linux
systems.
▶ Since approximately 2016, new frameworks use dynamically allocated major
numbers when possible.
▶ Minor numbers are almost always (partially) dynamically allocated by the
framework itself. Allocation happens in order of enumeration of the devices.
▶ Pre-defined values, ranges and rules can be found in admin-guide/devices.
Devices: everything is a file
▶ A very important UNIX design decision was to represent most system objects as
files
▶ It allows applications to manipulate all system objects with the normal file API
(open, read, write, close, etc.)
▶ So, devices had to be represented as files to the applications
▶ This is done through a special artifact called a device file
▶ It is a special type of file, that associates a file name visible to user space
applications to the triplet (type, major, minor) that the kernel understands
▶ All device files are by convention stored in the /dev directory
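As a rough user-space illustration, the (type, major, minor) triplet behind a device file can be read back with stat() and the major()/minor() macros from glibc's <sys/sysmacros.h>. The helper name device_triplet() is hypothetical; this is a sketch assuming a Linux system with glibc, not something taken from the kernel sources.

```c
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/*
 * Look up the (type, major, minor) triplet behind a device file.
 * Returns 0 on success, -1 if the path cannot be stat()ed or is
 * not a device file.
 */
int device_triplet(const char *path, char *type,
                   unsigned int *maj, unsigned int *min)
{
        struct stat st;

        if (stat(path, &st) < 0)
                return -1;

        if (S_ISCHR(st.st_mode))
                *type = 'c';            /* character device */
        else if (S_ISBLK(st.st_mode))
                *type = 'b';            /* block device */
        else
                return -1;              /* regular file, directory, ... */

        /* st_rdev holds the device number of the special file */
        *maj = major(st.st_rdev);
        *min = minor(st.st_rdev);
        return 0;
}
```

On a typical Linux system, calling this helper on /dev/null should report a character device, since that file is a character special file with fixed numbers.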
Device files examples
Example of device files in a Linux system
$ ls -l /dev/ttyS0 /dev/tty1 /dev/sda /dev/sda1 /dev/sda2 /dev/sdc /dev/zero
brw-rw---- 1 root disk 8, 0 2011-05-27 08:56 /dev/sda
brw-rw---- 1 root disk 8, 1 2011-05-27 08:56 /dev/sda1
brw-rw---- 1 root disk 8, 2 2011-05-27 08:56 /dev/sda2
brw-rw---- 1 root disk 8, 32 2011-05-27 08:56 /dev/sdc
crw------- 1 root root 4, 1 2011-05-27 08:57 /dev/tty1
crw-rw---- 1 root dialout 4, 64 2011-05-27 08:56 /dev/ttyS0
crw-rw-rw- 1 root root 1, 5 2011-05-27 08:56 /dev/zero
Example C code that uses the usual file API to write data to a serial port
int fd;
fd = open("/dev/ttyS0", O_RDWR);
write(fd, "Hello", 5);
close(fd);
Creating device files
▶ Before Linux 2.6.32, on basic Linux systems, the device files had to be created
manually using the mknod command
• mknod /dev/<device> [c|b] major minor
• Needed root privileges
• Coherency between device files and devices handled by the kernel was left to the
system developer
▶ The devtmpfs virtual filesystem can be mounted on /dev and contains all the
devices registered to kernel frameworks. The CONFIG_DEVTMPFS_MOUNT kernel
configuration option makes the kernel mount it automatically at boot time, except
when booting on an initramfs.
▶ devtmpfs can be supplemented by userspace tools like udev or mdev to adjust
permission/ownership, load kernel modules automatically and create symbolic
links to devices.
Kernel frameworks for device drivers
Character drivers
A character driver in the kernel
From user space to the kernel: character devices
File operations
Here are the most important operations for a character driver, from the definition of
struct file_operations:
struct file_operations {
struct module *owner;
ssize_t (*read) (struct file *, char __user *,
size_t, loff_t *);
ssize_t (*write) (struct file *, const char __user *,
size_t, loff_t *);
long (*unlocked_ioctl) (struct file *, unsigned int,
unsigned long);
int (*mmap) (struct file *, struct vm_area_struct *);
int (*open) (struct inode *, struct file *);
int (*release) (struct inode *, struct file *);
...
};
Many operations exist, they are all optional.
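Because every operation is optional, whoever calls through such a table has to check each hook before using it. The user-space model below sketches that idea with a reduced two-entry table. struct model_file_ops, zero_read(), zero_ops and the model_do_*() helpers are hypothetical, and returning -EINVAL for a missing hook is a choice made for this sketch rather than a statement about every kernel code path.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Reduced stand-in for struct file_operations: two optional hooks */
struct model_file_ops {
        ssize_t (*read)(char *buf, size_t sz);
        ssize_t (*write)(const char *buf, size_t sz);
};

/* Hypothetical device that only produces zero bytes */
static ssize_t zero_read(char *buf, size_t sz)
{
        size_t i;

        for (i = 0; i < sz; i++)
                buf[i] = 0;
        return (ssize_t)sz;
}

/* Only .read is set: .write is left out and therefore NULL */
static const struct model_file_ops zero_ops = {
        .read = zero_read,
};

/* The caller checks each hook before calling through it */
ssize_t model_do_read(const struct model_file_ops *ops, char *buf, size_t sz)
{
        if (!ops->read)
                return -EINVAL;
        return ops->read(buf, sz);
}

ssize_t model_do_write(const struct model_file_ops *ops,
                       const char *buf, size_t sz)
{
        if (!ops->write)
                return -EINVAL;
        return ops->write(buf, sz);
}
```

Designated initializers make this convenient: a driver lists only the operations it implements, and the remaining pointers are zero-initialized.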
open() and release()
read() and write()
▶ ssize_t foo_read(struct file *f, char __user *buf, size_t sz, loff_t *off)
• Called when user space uses the read() system call on the device.
• Must read data from the device, write at most sz bytes to the user space buffer buf,
and update the current position in the file off. f is a pointer to the same file
structure that was passed in the open() operation.
• Must return the number of bytes read, or a negative error code.
0 is usually interpreted by user space as the end of the file.
• On UNIX, read() operations typically block when there isn’t enough data to read
from the device
▶ ssize_t foo_write(struct file *f, const char __user *buf, size_t sz, loff_t *off)
• Called when user space uses the write() system call on the device.
• The opposite of read: must read at most sz bytes from buf, write them to the device,
update off and return the number of bytes written.
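The read() contract described above (copy at most sz bytes, advance the position, return 0 at the end of the data) can be sketched in user space as follows. This is a model under stated assumptions, not driver code: the device is a plain in-memory buffer, model_read() is a hypothetical name, and memcpy() stands in for the kernel function that copies to user space.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* User-space model of a read() file operation over an in-memory "device".
 * memcpy() is only a stand-in: a real driver must never use memcpy()
 * on a user space pointer. */
ssize_t model_read(const char *dev_data, size_t dev_len,
                   char *buf, size_t sz, off_t *off)
{
    size_t remaining;

    if (*off < 0)
        return -EINVAL;
    if ((size_t)*off >= dev_len)
        return 0;                   /* nothing left: end of file */

    remaining = dev_len - (size_t)*off;
    if (sz > remaining)
        sz = remaining;             /* never return more than requested */

    memcpy(buf, dev_data + *off, sz);
    *off += (off_t)sz;              /* update the current position */
    return (ssize_t)sz;             /* number of bytes actually read */
}
```

Calling it repeatedly with the same position variable returns successive chunks and finally 0, which is what tools like cat rely on to stop reading.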
Exchanging data with user space 1/3
▶ Kernel code isn’t allowed to directly access user space memory, using memcpy() or
direct pointer dereferencing
• User pointer dereferencing is disabled by default to make it harder to exploit
vulnerabilities.
• If the address passed by the application was invalid, the kernel could crash (oops)
instead of returning an error to the application.
• Never trust user space. A malicious application could pass a kernel address which
you could overwrite with device data (read case), or which you could dump to the
device (write case).
• Doing so does not work on some architectures anyway.
▶ To keep the kernel code portable, secure, and have proper error handling, your
driver must use special kernel functions to exchange data with user space.
Exchanging data with user space 2/3
▶ A single value
• get_user(v, p);
The kernel variable v gets the value pointed to by the user space pointer p
• put_user(v, p);
The value pointed to by the user space pointer p is set to the contents of the kernel
variable v.
▶ A buffer
• unsigned long copy_to_user(void __user *to, const void *from,
unsigned long n);
• unsigned long copy_from_user(void *to, const void __user *from,
unsigned long n);
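▶ The return value must be checked. get_user() and put_user() return zero on
success or -EFAULT on failure. copy_to_user() and copy_from_user() return the
number of bytes that could not be copied, so zero means success. If non-zero, the
convention is to return -EFAULT.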
Exchanging data with user space 3/3
Zero copy access to user memory
unlocked_ioctl()
▶ long unlocked_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
• Associated with the ioctl() system call.
• Called unlocked because it is invoked without holding the Big Kernel Lock (which
has since been removed).
• Allows extending the driver capabilities beyond the limited read/write API.
• For example: changing the speed of a serial port, setting video output format,
querying a device serial number... Used extensively in the V4L2 (video) and ALSA
(sound) driver frameworks.
• cmd is a number identifying the operation to perform.
See driver-api/ioctl for the recommended way of choosing cmd numbers.
• arg is the optional argument passed as third argument of the ioctl() system call.
Can be an integer, an address, etc.
• The semantics of cmd and arg are driver-specific.
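As a sketch of how cmd numbers are usually built, the code below defines two commands with the _IOW() and _IOR() macros from <linux/ioctl.h>, which encode a direction, a magic byte, a sequence number and the argument size. The structure, the FOO_ names and the magic value 'f' are assumptions made up for illustration; a real driver should pick a magic that does not collide with existing ones.

```c
#include <linux/ioctl.h>
#include <stdint.h>

/* Hypothetical argument structure shared by driver and application. */
struct foo_reg {
    uint32_t reg;
    uint32_t value;
};

#define FOO_IOC_MAGIC 'f'   /* assumed magic byte, for illustration only */
#define FOO_SET_REG   _IOW(FOO_IOC_MAGIC, 1, struct foo_reg) /* user -> kernel */
#define FOO_GET_REG   _IOR(FOO_IOC_MAGIC, 2, struct foo_reg) /* kernel -> user */

/* Return 1 when cmd carries this hypothetical driver's magic, 0 otherwise. */
int foo_cmd_is_ours(unsigned int cmd)
{
    return _IOC_TYPE(cmd) == FOO_IOC_MAGIC;
}
```

Because the argument size is part of the number, changing the layout of struct foo_reg changes the command values, which helps catch mismatches between application and driver.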
ioctl() example: kernel side
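#include <linux/phantom.h>
static long phantom_ioctl(struct file *file, unsigned int cmd,
                          unsigned long arg)
{
    struct phm_reg r;
    void __user *argp = (void __user *)arg;

    switch (cmd) {
    case PHN_SET_REG:
        if (copy_from_user(&r, argp, sizeof(r)))
            return -EFAULT;
        /* Do something with r */
        break;
    ...
    case PHN_GET_REG:
        /* Do something to fill r */
        if (copy_to_user(argp, &r, sizeof(r)))
            return -EFAULT;
        break;
    ...
    default:
        return -ENOTTY;
    }
    return 0;
}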
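ioctl() example: application side
#include <assert.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <linux/phantom.h>
int main(void)
{
    int fd, ret;
    struct phm_reg reg;

    fd = open("/dev/phantom", O_RDWR);
    assert(fd >= 0);
    reg.reg = 42;
    reg.value = 67;
    ret = ioctl(fd, PHN_SET_REG, &reg);
    assert(ret == 0);
    return 0;
}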
Kernel frameworks for device drivers
Beyond character drivers: kernel frameworks
Example: Some Kernel Frameworks
Example: Framebuffer Framework
Framebuffer driver operations
Here are the operations a framebuffer driver can or must implement. They are defined in a
struct fb_ops structure (excerpt from drivers/video/fbdev/skeletonfb.c):
static struct fb_ops xxxfb_ops = {
    .owner          = THIS_MODULE,
    .fb_open        = xxxfb_open,
    .fb_read        = xxxfb_read,
    .fb_write       = xxxfb_write,
    .fb_release     = xxxfb_release,
    .fb_check_var   = xxxfb_check_var,
    .fb_set_par     = xxxfb_set_par,
    .fb_setcolreg   = xxxfb_setcolreg,
    .fb_blank       = xxxfb_blank,
    .fb_pan_display = xxxfb_pan_display,
    .fb_fillrect    = xxxfb_fillrect,   /* Needed !!! */
    .fb_copyarea    = xxxfb_copyarea,   /* Needed !!! */
    .fb_imageblit   = xxxfb_imageblit,  /* Needed !!! */
    .fb_cursor      = xxxfb_cursor,     /* Optional !!! */
    .fb_rotate      = xxxfb_rotate,
    .fb_sync        = xxxfb_sync,
    .fb_ioctl       = xxxfb_ioctl,
    .fb_mmap        = xxxfb_mmap,
};
Framebuffer driver code
Kernel frameworks for device drivers
Device-managed allocations
Device managed allocations
Device managed allocations: memory allocation example
▶ Normally done with kmalloc(size_t, gfp_t), released with kfree(void *)
▶ Device managed with devm_kmalloc(struct device *, size_t, gfp_t)
Device managed allocations caveats
▶ Cleanup is done when the struct device is cleaned up. There is no reference
counting or anything like that.
▶ Don’t use device-managed allocations for memory that can still be accessed after
the device is gone, e.g. when a user space application still holds the device file open
after remove().
▶ Be very careful when there are circular references.
▶ “Why is devm_kzalloc() harmful and what can we do about it”, Laurent
Pinchart, LPC 2022
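The lifetime rule behind these caveats can be illustrated with a small user-space model. It is not the kernel implementation: struct model_device and the model_ functions are hypothetical names, and malloc()/free() stand in for the kernel allocator. Each allocation is recorded on the device, and everything is freed when the device is released, whether or not someone else still holds a pointer.

```c
#include <stdlib.h>

/* One recorded allocation, linked newest first. */
struct model_res {
    struct model_res *next;
    void *mem;
};

/* Stand-in for struct device: only the resource list is modelled. */
struct model_device {
    struct model_res *resources;
};

/* Allocate memory whose lifetime is tied to dev (devm_kmalloc() analogue). */
void *model_devm_alloc(struct model_device *dev, size_t size)
{
    struct model_res *res = malloc(sizeof(*res));

    if (!res)
        return NULL;
    res->mem = malloc(size);
    if (!res->mem) {
        free(res);
        return NULL;
    }
    res->next = dev->resources;
    dev->resources = res;
    return res->mem;
}

/* Free everything attached to dev, newest first; returns the count freed.
 * There is no reference counting: outstanding pointers become dangling. */
int model_device_release(struct model_device *dev)
{
    int freed = 0;

    while (dev->resources) {
        struct model_res *res = dev->resources;

        dev->resources = res->next;
        free(res->mem);
        free(res);
        freed++;
    }
    return freed;
}
```

The driver never calls a free function itself, which removes error-path cleanup code but also explains why memory referenced from a still-open file must be managed differently.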
Kernel frameworks for device drivers
Driver-specific Data Structure
Driver-specific Data Structure Examples 1/2
Driver-specific Data Structure Examples 2/2
▶ ds1305 RTC driver: struct ds1305 has a reference to struct rtc_device
struct ds1305 {
    struct spi_device *spi;
    struct rtc_device *rtc;
    [...]
};
Links between structures 1/4
▶ The framework structure typically contains a struct device * pointer that the
driver must set to point to the corresponding struct device
• It’s the relationship between the logical device (for example a network interface) and
the physical device (for example the USB network adapter)
▶ The device structure also contains a void * pointer that the driver can freely use.
• It’s often used to link the device back to the higher-level structure from the
framework.
• For example, it makes it possible to find the structure describing the logical device
starting from the struct platform_device structure
Links between structures 2/4
Links between structures 3/4
[...]
[...]
ds1305->rtc = devm_rtc_allocate_device(&spi->dev);
// Arrows 3 and 4
[...]
}
[...]
}
Links between structures 4/4
static int rtl8150_probe(struct usb_interface *intf,
                         const struct usb_device_id *id)
{
    struct usb_device *udev = interface_to_usbdev(intf);
    rtl8150_t *dev;
    struct net_device *netdev;
    netdev = alloc_etherdev(sizeof(rtl8150_t));
    dev = netdev_priv(netdev);
[...]
[...]
[...]
}
[...]
}
The input subsystem
What is the input subsystem?
▶ The input subsystem takes care of all the input events coming from the human
user.
▶ Initially written to support USB HID (Human Interface Device) devices, it
quickly grew to handle all kinds of inputs (using USB or not): keyboards, mice,
joysticks, touchscreens, etc.
▶ The input subsystem is split in two parts:
• Device drivers: they talk to the hardware (for example via USB), and provide
events (keystrokes, mouse movements, touchscreen coordinates) to the input core
• Event handlers: they get events from drivers and pass them where needed via
various interfaces (most of the time through evdev)
▶ In user space, it is usually used by the graphics stack, such as X.Org, Wayland or
Android’s InputManager.
Input subsystem diagram
Input subsystem overview
Input subsystem API 1/3
An input device is described by a very long struct input_dev structure. Here is an excerpt:
struct input_dev {
    const char *name;
    [...]
    struct input_id id;
    [...]
    unsigned long evbit[BITS_TO_LONGS(EV_CNT)];
    unsigned long keybit[BITS_TO_LONGS(KEY_CNT)];
    [...]
    int (*getkeycode)(struct input_dev *dev,
                      struct input_keymap_entry *ke);
    [...]
    int (*open)(struct input_dev *dev);
    [...]
    int (*event)(struct input_dev *dev, unsigned int type,
                 unsigned int code, int value);
    [...]
};
Before being used, this structure must be allocated and initialized, typically with:
struct input_dev *devm_input_allocate_device(struct device *dev);
Input subsystem API 2/3
▶ Depending on the type of events that will be generated, the input bit fields evbit
and keybit must be configured. For example, a single button only generates
EV_KEY events, and among these only the BTN_0 event code:
set_bit(EV_KEY, myinput_dev->evbit);
set_bit(BTN_0, myinput_dev->keybit);
▶ Once the input device is allocated and filled, the function to register it is:
int input_register_device(struct input_dev *);
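To make the role of evbit and keybit more concrete, here is a user-space sketch of the two bitmaps. The event constants (EV_KEY, BTN_0, EV_CNT, KEY_CNT) come from the real <linux/input.h> header, but struct model_input_dev and the model_ helpers are hypothetical, non-atomic stand-ins for the kernel structure and its set_bit() / test_bit() functions.

```c
#include <limits.h>
#include <linux/input.h>

#define MODEL_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define MODEL_BITS_TO_LONGS(n) \
    (((n) + MODEL_BITS_PER_LONG - 1) / MODEL_BITS_PER_LONG)

/* Only the two capability bitmaps discussed above are modelled. */
struct model_input_dev {
    unsigned long evbit[MODEL_BITS_TO_LONGS(EV_CNT)];
    unsigned long keybit[MODEL_BITS_TO_LONGS(KEY_CNT)];
};

/* Set bit nr in a bitmap stored as an array of unsigned long. */
void model_set_bit(unsigned int nr, unsigned long *map)
{
    map[nr / MODEL_BITS_PER_LONG] |= 1UL << (nr % MODEL_BITS_PER_LONG);
}

/* Return 1 when bit nr is set, 0 otherwise. */
int model_test_bit(unsigned int nr, const unsigned long *map)
{
    return (map[nr / MODEL_BITS_PER_LONG] >> (nr % MODEL_BITS_PER_LONG)) & 1UL;
}

/* Declare a single-button device: EV_KEY events carrying BTN_0 only. */
void model_setup_button(struct model_input_dev *dev)
{
    model_set_bit(EV_KEY, dev->evbit);
    model_set_bit(BTN_0, dev->keybit);
}
```

Each event type and each key code owns one bit, so the input core can check quickly whether a device is able to emit a given event.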
Input subsystem API 3/3
The events are sent by the driver to the event handler using
void input_event(struct input_dev *dev, unsigned int type, unsigned int code, int value)
Example from drivers/hid/usbhid/usbmouse.c
input_sync(dev);
...
}
Polling input devices
▶ The input subsystem provides an API to support simple input devices that do not
raise interrupts but have to be periodically scanned or polled to detect changes in
their state.
▶ Setting up polling is done using input_setup_polling():
int input_setup_polling(struct input_dev *dev, void (*poll_fn)(struct input_dev *dev));
▶ poll_fn is the function that will be called periodically.
▶ The polling interval can be set using input_set_poll_interval() or
input_set_min_poll_interval() and input_set_max_poll_interval()
evdev user space interface
▶ The main user space interface to input devices is the event interface
▶ Each input device is represented as a /dev/input/event<X> character device
▶ A user space application can use blocking and non-blocking reads, but also
select() (to get notified of events) after opening this device.
▶ Each read will return struct input_event structures of the following format:
struct input_event {
    struct timeval time;
    unsigned short type;
    unsigned short code;
    int value;
};
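A minimal sketch of the application side could look like the code below. It uses struct input_event and the EV_KEY constant from <linux/input.h>; the two helper names are assumptions, and the caller is expected to have opened a device such as /dev/input/event0 (the index depends on the system).

```c
#include <linux/input.h>
#include <unistd.h>

/* Read one event from an open evdev file descriptor.
 * Returns 1 when ev was filled, 0 on end of file, -1 on error or short read. */
int read_one_event(int fd, struct input_event *ev)
{
    ssize_t n = read(fd, ev, sizeof(*ev));

    if (n == 0)
        return 0;
    return n == (ssize_t)sizeof(*ev) ? 1 : -1;
}

/* Non-zero for a key or button press. For EV_KEY events, value is
 * 1 on press, 0 on release and 2 on autorepeat. */
int is_key_press(const struct input_event *ev)
{
    return ev->type == EV_KEY && ev->value == 1;
}
```

A typical loop calls read_one_event() until it returns something other than 1, and dispatches on ev.type and ev.code. With a descriptor opened in non-blocking mode, read() fails with EAGAIN when no event is pending, which this sketch reports as -1.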
Practical lab - Expose the Nunchuk to user space
Memory Management
Physical and virtual memory
Virtual memory organization
Physical/virtual memory mapping on 32-bit systems
32-bit systems limitations
▶ Less than 1GB of memory is directly addressable through kernel virtual addresses
▶ If more physical memory is present on the platform, part of the memory will not
be accessible by kernel space, but can be used by user space
▶ To allow the kernel to access more physical memory:
• Change the 3GB/1GB memory split to 2GB/2GB or 1GB/3GB (CONFIG_VMSPLIT_2G
or CONFIG_VMSPLIT_1G) ⇒ reduce total user memory available for each process
• Activate highmem support if available for your architecture:
- Allows the kernel to map parts of its non-directly accessible memory
- Mapping must be requested explicitly
- Limited address ranges reserved for this usage
▶ See Arnd Bergmann’s 4GB by 4GB split presentation (video and slides) at Linaro
Connect virtual 2020.
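The effect of the split options can be checked with simple arithmetic. The sketch below computes the size of the kernel part of a 32-bit address space from the address where kernel space starts. The function name is hypothetical, and the start addresses quoted afterwards are the usual values on x86 and ARM, given here as assumptions.

```c
#include <stdint.h>

/* Size in bytes of the kernel part of a 32-bit virtual address space,
 * given the first kernel virtual address (PAGE_OFFSET in kernel terms). */
uint64_t kernel_space_bytes(uint32_t page_offset)
{
    return (UINT64_C(1) << 32) - page_offset;
}
```

With a start address of 0xC0000000 (3GB/1GB split) this gives 1GB, with 0x80000000 it gives 2GB and with 0x40000000 it gives 3GB. The directly mapped memory is smaller than these figures, because part of kernel space is reserved for other mappings such as vmalloc and I/O memory.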
Physical/virtual memory mapping on 64-bit systems (4kiB-pages)
User space virtual address space
▶ When a process starts, the executable code is loaded in RAM and mapped into the
process virtual address space.
▶ During execution, additional mappings can be created:
• Memory allocations
• Memory mapped files
• mmap’ed areas
• ...
Userspace memory allocations
Kernel memory allocators
Page allocator
Page allocator API
Page allocator flags
SLAB allocator 1/2
▶ The SLAB allocator makes it possible to create caches, each containing a set of
objects of the same size. In English, slab means tile.
▶ The object size can be smaller or greater than the page size
▶ The SLAB allocator takes care of growing or reducing the size of the cache as
needed, depending on the number of allocated objects. It uses the page allocator
to allocate and free pages.
▶ SLAB caches are used for data structures that are present in many instances in
the kernel: directory entries, file objects, network packet descriptors, process
descriptors, etc.
• See /proc/slabinfo
▶ They are rarely used for individual drivers.
▶ See include/linux/slab.h for the API
SLAB allocator 2/2
Different SLAB allocators
There are different, but API-compatible, implementations of a SLAB allocator in the Linux
kernel. A particular implementation is chosen at configuration time.
▶ CONFIG_SLAB: legacy implementation, deprecated and then removed in Linux 6.8
▶ CONFIG_SLUB: the default allocator, scaling better and creating less fragmentation than
previous implementations.
▶ CONFIG_SLUB_TINY: configure SLUB to achieve minimal memory footprint, sacrificing
scalability, debugging and other features. Not recommended for systems with more than
16 MB of RAM.
kmalloc allocator
▶ The kmalloc allocator is the general purpose memory allocator in the Linux kernel
▶ For small sizes, it relies on generic SLAB caches, named kmalloc-XXX in
/proc/slabinfo
▶ For larger sizes, it relies on the page allocator
▶ The allocated area is guaranteed to be physically contiguous
▶ The allocated area size is rounded up to the size of the smallest SLAB cache in
which it can fit (using the SLAB allocator directly gives more flexibility)
▶ It uses the same flags as the page allocator (GFP_KERNEL, GFP_ATOMIC, etc.) with
the same semantics.
▶ Maximum sizes, on x86 and arm (see https://j.mp/YIGq6W):
- Per allocation: 4 MB
- Total allocations: 128 MB
▶ Should be used as the primary allocator unless there is a strong reason to use
another one.
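The rounding behaviour mentioned above can be sketched with a small user-space model. The list of cache sizes is an assumption based on what /proc/slabinfo commonly shows (kmalloc-8 up to kmalloc-8k); the real list depends on the architecture and on the kernel configuration, and model_kmalloc_bucket() is a hypothetical name.

```c
#include <stddef.h>

/* Assumed generic cache sizes, in bytes; not read from a real kernel. */
static const size_t model_kmalloc_sizes[] = {
    8, 16, 32, 64, 96, 128, 192, 256, 512, 1024, 2048, 4096, 8192,
};

/* Return the size of the smallest assumed cache that can hold size bytes
 * (size >= 1), or 0 when the request would go to the page allocator. */
size_t model_kmalloc_bucket(size_t size)
{
    size_t count = sizeof(model_kmalloc_sizes) / sizeof(model_kmalloc_sizes[0]);
    size_t i;

    for (i = 0; i < count; i++)
        if (size <= model_kmalloc_sizes[i])
            return model_kmalloc_sizes[i];
    return 0;
}
```

Under these assumptions, a 100-byte request consumes 128 bytes and a 129-byte request consumes 192 bytes, which is why a dedicated cache can save memory for structures allocated in large numbers.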
kmalloc API 1/2
▶ #include <linux/slab.h>
▶ void *kmalloc(size_t size, gfp_t flags);
• Allocate size bytes, and return a pointer to the area (virtual address), or NULL if
the allocation fails
• size: number of bytes to allocate
• flags: same flags as the page allocator
▶ void kfree(const void *objp);
• Free an allocated area
▶ Example: (drivers/infiniband/core/cache.c)
    struct ib_port_attr *tprops;
    tprops = kmalloc(sizeof *tprops, GFP_KERNEL);
    ...
    kfree(tprops);
kmalloc API 2/2
devm_kmalloc functions
vmalloc allocator
▶ The vmalloc() allocator can be used to obtain memory zones that are contiguous
in the virtual addressing space, but not made out of physically contiguous pages.
▶ The requested memory size is rounded up to the next page (not efficient for small
allocations).
▶ The allocated area is in the kernel space part of the address space, but outside of
the identically-mapped area
▶ Allocations of fairly large areas are possible (almost as big as total available
memory, see https://j.mp/YIGq6W again), since physical memory fragmentation
is not an issue.
▶ Not suitable for DMA purposes.
▶ API in include/linux/vmalloc.h
• void *vmalloc(unsigned long size);
Returns a virtual address
• void vfree(void *addr);
Kernel memory debugging
Kernel memory management: resources
Virtual memory and Linux, Alan Ott and Matt Porter, 2016
Great and much more complete presentation about this topic
https://bit.ly/2Af1G2i (video: https://bit.ly/2Bwwv0C)
I/O Memory
Memory-Mapped I/O
Requesting I/O memory
/proc/iomem example - ARM 32 bit (BeagleBone Black, Linux 5.11)
Mapping I/O memory in virtual memory
ioremap()
Accessing MMIO devices: using accessor functions
MMIO access functions
MMIO access functions summary
Ordering
/dev/mem
▶ Used to provide user space applications with direct access to physical addresses.
▶ Usage: open /dev/mem and read or write at given offset. What you read or write
is the value at the corresponding physical address.
▶ Used by applications such as the X server to write directly to device memory.
▶ Easy to use from a shell with the devmem2 program
▶ For security reasons, on x86, arm, arm64, riscv, powerpc, parisc, s390:
• CONFIG_STRICT_DEVMEM restricts /dev/mem to non-RAM addresses (from v5.12)
• CONFIG_IO_STRICT_DEVMEM goes further and only allows access to idle I/O ranges
(not appearing in /proc/iomem).
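What a tool like devmem2 does can be sketched as follows: map the page containing the physical address with mmap() on /dev/mem, then read through the mapping. The function names are assumptions, the code needs root privileges, and on kernels with the restrictions listed above the open or the mapping may be refused.

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Page-aligned base of a physical address; mmap() offsets must be aligned. */
uint64_t page_base(uint64_t phys, uint64_t page_size)
{
    return phys & ~(page_size - 1);
}

/* Read one 32-bit word at physical address phys (4-byte aligned)
 * through /dev/mem. Returns 0 on success, -1 on failure. */
int devmem_read32(uint64_t phys, uint32_t *out)
{
    uint64_t page_size = (uint64_t)sysconf(_SC_PAGESIZE);
    uint64_t base = page_base(phys, page_size);
    volatile uint8_t *map;
    int fd = open("/dev/mem", O_RDONLY | O_SYNC);

    if (fd < 0)
        return -1;
    map = mmap(NULL, page_size, PROT_READ, MAP_SHARED, fd, (off_t)base);
    close(fd);                      /* the mapping survives close() */
    if (map == MAP_FAILED)
        return -1;
    *out = *(volatile uint32_t *)(map + (phys - base));
    munmap((void *)map, page_size);
    return 0;
}
```

This kind of access is handy for quick experiments, but it bypasses the kernel drivers entirely, so it is not a substitute for a proper driver.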
Practical lab - I/O memory and ports
The misc subsystem
Why a misc subsystem?
▶ The kernel offers a large number of frameworks covering a wide range of device
types: input, network, video, audio, etc.
• These frameworks make it possible to factor out common functionality between
drivers and offer a consistent API to user space applications.
▶ However, there are some devices that really do not fit in any of the existing
frameworks.
• Highly customized devices implemented in an FPGA, or other unusual devices for
which implementing a complete framework is not useful.
▶ The drivers for such devices could be implemented directly as raw character
drivers (with cdev_init() and cdev_add()).
▶ But there is a subsystem that makes this work a little bit easier: the misc
subsystem.
• It is really only a thin layer above the character driver API.
• Another advantage is that devices are integrated in the Device Model (device files
appearing in devtmpfs, which you don’t have with raw character devices).
Misc subsystem diagram
Misc subsystem API (1/2)
▶ The misc subsystem API mainly provides two functions, to register and unregister
a single misc device:
• int misc_register(struct miscdevice *misc);
• void misc_deregister(struct miscdevice *misc);
▶ A misc device is described by a struct miscdevice structure:
struct miscdevice {
    int minor;
    const char *name;
    const struct file_operations *fops;
    struct list_head list;
    struct device *parent;
    struct device *this_device;
    const char *nodename;
    umode_t mode;
};
Misc subsystem API (2/2)
User space API for misc devices
Practical lab - Output-only serial port driver
Processes, scheduling and interrupts
Processes, scheduling and interrupts
Process, thread?
Process, thread: kernel point of view
Relation between execution mode, address space and context
▶ When speaking about processes and threads, these concepts need to be clarified:
• Mode is the level of privilege required to perform some operations:
- Kernel Mode: at this level, the CPU can perform any operation allowed by its
architecture: any instruction, any I/O operation, access to any area of memory.
- User Mode: at this level, certain instructions are not permitted (especially those
that could alter the global state of the machine) and some memory areas cannot be
accessed.
• Linux splits its address space into kernel space and user space:
- Kernel space is reserved for code running in Kernel Mode.
- User space is the place where applications execute (it is also accessible from
Kernel Mode).
• Context represents the current state of an execution flow:
- The process context can be seen as the content of the registers associated with this
process: program counter, stack pointer...
- The interrupt context replaces the process context when the interrupt handler is
executed.
A thread life
Execution of system calls
The execution of system calls takes place in the context of the thread requesting them.
Processes, scheduling and interrupts
Sleeping
Sleeping
Sleeping is needed when a process (user space or kernel space) is waiting for data.
How to sleep with a wait queue 1/3
▶ Must declare a wait queue, which will be used to store the list of threads waiting
for an event
▶ Dynamic queue declaration:
• Typically one queue per device managed by the driver
• It’s convenient to embed the wait queue inside a per-device data structure.
• Example from drivers/net/ethernet/marvell/mvmdio.c:
struct orion_mdio_dev {
    ...
    wait_queue_head_t smi_busy_wait;
};
struct orion_mdio_dev *dev;
...
init_waitqueue_head(&dev->smi_busy_wait);
▶ Static queue declaration:
• Using a global variable when a global resource is sufficient
• DECLARE_WAIT_QUEUE_HEAD(module_queue);
How to sleep with a waitqueue 2/3
How to sleep with a waitqueue 3/3
How to sleep with a waitqueue - Example
sig = wait_event_interruptible(ibmvtpm->wq,
                               !ibmvtpm->tpm_processing_cmd);
if (sig)
    return -EINTR;
From drivers/char/tpm/tpm_ibmvtpm.c
Waking up!
Typically done by interrupt handlers when the data that sleeping processes are waiting
for becomes available.
▶ wake_up(&queue);
• Wakes up all processes in the wait queue
▶ wake_up_interruptible(&queue);
• Wakes up all processes waiting in an interruptible sleep on the given queue
Exclusive vs. non-exclusive
Sleeping and waking up - Implementation
How to sleep with completions 1/2
▶ Use wait_for_completion() when no particular condition must be enforced at
the time of the wake-up
• Leverages the power of wait queues
• Simplifies its use
• Highly efficient using low level scheduler facilities
▶ Preparation of the completion structure:
• Static declaration and initialization:
DECLARE_COMPLETION(setup_done);
• Dynamic declaration:
init_completion(&object->setup_done);
• The completion object should get a meaningful name (e.g. not just “done”).
▶ Ready to be used by signal consumers and providers as soon as the completion
object is initialized
▶ See include/linux/completion.h for the full API
▶ Internal documentation at scheduler/completion
How to sleep with completions 2/2
▶ Enter a wait state with
void wait_for_completion(struct completion *done)
• All wait_event() flavors are also supported, such as:
wait_for_completion_timeout(),
wait_for_completion_interruptible{,_timeout}(),
wait_for_completion_killable{,_timeout}(), etc
▶ Wake up consumers with
void complete(struct completion *done)
• Several calls to complete() are valid, they will wake up the same number of threads
waiting on this object (acts as a FIFO).
• A single complete_all() call would wake up all present and future threads waiting
on this completion object
▶ Reset the counter with
void reinit_completion(struct completion *done)
• Resets the number of “done” completions still pending
• Mind not to call init_completion() twice, which could confuse the enqueued tasks
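As an illustration of the calls above, here is a minimal sketch of a transfer that waits for a completion signalled from an interrupt handler. struct my_dev, xfer_done, my_do_transfer() and the 100 ms timeout are assumptions made for this example.

```c
#include <linux/completion.h>
#include <linux/errno.h>
#include <linux/interrupt.h>
#include <linux/jiffies.h>

/* Hypothetical per-device structure */
struct my_dev {
	struct completion xfer_done;	/* init_completion() called once in probe() */
};

/* Process context: start a transfer and sleep until it is done */
static int my_do_transfer(struct my_dev *dev)
{
	unsigned long left;

	/* Forget any completion left over from a previous transfer */
	reinit_completion(&dev->xfer_done);

	/* ... start the hardware transfer here ... */

	/* Returns 0 on timeout, otherwise the remaining jiffies (at least 1) */
	left = wait_for_completion_timeout(&dev->xfer_done,
					   msecs_to_jiffies(100));
	if (!left)
		return -ETIMEDOUT;

	return 0;
}

/* Interrupt context: wake up one waiter */
static irqreturn_t my_irq_handler(int irq, void *dev_id)
{
	struct my_dev *dev = dev_id;

	complete(&dev->xfer_done);
	return IRQ_HANDLED;
}
```

Compared with a wait queue, no condition variable has to be managed by the driver: the completion object keeps track of pending complete() calls itself.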
Waiting when there is no interrupt
Waiting when hardware is involved
Processes, scheduling and interrupts
Interrupt Management
Registering an interrupt handler 1/2
The managed API is recommended:
int devm_request_irq(struct device *dev, unsigned int irq, irq_handler_t handler,
unsigned long irq_flags, const char *devname, void *dev_id);
▶ dev is the device, used for automatic freeing at device or module release time.
▶ irq is the requested IRQ channel. For platform devices, use platform_get_irq()
to retrieve the interrupt number.
▶ handler is a pointer to the IRQ handler function
▶ irq_flags are option masks (see next slide)
▶ devname is the registered name (for /proc/interrupts). For platform drivers, it is a
good idea to use pdev->name, which makes it possible to distinguish devices managed by the
same driver (example: 44e0b000.i2c).
▶ dev_id is an opaque pointer. It can typically be used to pass a pointer to a
per-device data structure. It cannot be NULL as it is used as an identifier for
freeing interrupts on a shared line.
Registering an interrupt handler 2/2
Here are the most frequent irq_flags bit values in drivers (can be combined):
▶ IRQF_SHARED: interrupt channel can be shared by several devices.
• When an interrupt is received, all the interrupt handlers registered on the same
interrupt line are called.
• This requires a hardware status register telling whether an IRQ was raised or not.
▶ IRQF_ONESHOT: for use by threaded interrupts (see next slides). Keeps the
interrupt line disabled until the thread function has run.
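A minimal sketch of a platform driver registering a handler on a shared line is given below. The register offset, the status bit and the write-to-acknowledge behavior are hypothetical hardware details, and struct my_dev, my_irq_handler() and my_probe() are illustrative names.

```c
#include <linux/bits.h>
#include <linux/device.h>
#include <linux/err.h>
#include <linux/interrupt.h>
#include <linux/io.h>
#include <linux/platform_device.h>

#define MY_STATUS_REG	0x04	/* hypothetical register offset */
#define MY_IRQ_PENDING	BIT(0)	/* hypothetical "interrupt raised" bit */

struct my_dev {
	void __iomem *regs;
};

static irqreturn_t my_irq_handler(int irq, void *dev_id)
{
	struct my_dev *dev = dev_id;
	u32 status = readl(dev->regs + MY_STATUS_REG);

	/* Shared line: report when the interrupt does not come from us */
	if (!(status & MY_IRQ_PENDING))
		return IRQ_NONE;

	/* Assumption: writing the bit back acknowledges the interrupt */
	writel(MY_IRQ_PENDING, dev->regs + MY_STATUS_REG);
	return IRQ_HANDLED;
}

static int my_probe(struct platform_device *pdev)
{
	struct my_dev *dev;
	int irq;

	dev = devm_kzalloc(&pdev->dev, sizeof(*dev), GFP_KERNEL);
	if (!dev)
		return -ENOMEM;

	dev->regs = devm_platform_ioremap_resource(pdev, 0);
	if (IS_ERR(dev->regs))
		return PTR_ERR(dev->regs);

	irq = platform_get_irq(pdev, 0);
	if (irq < 0)
		return irq;

	/* dev is passed as dev_id, so it is non-NULL as IRQF_SHARED requires */
	return devm_request_irq(&pdev->dev, irq, my_irq_handler, IRQF_SHARED,
				pdev->name, dev);
}
```

Returning IRQ_NONE when the status register shows nothing pending is what lets the other handlers registered on the same line be identified as the real source.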
Interrupt handler constraints
▶ No guarantee in which address space the system will be in when the interrupt
occurs: can’t transfer data to and from user space.
▶ Interrupt handler execution is managed by the CPU, not by the scheduler.
Handlers can’t run actions that may sleep, because there is nothing to resume
their execution. In particular, need to allocate memory with GFP_ATOMIC.
▶ Interrupt handlers are run with all interrupts disabled on the local CPU (see
https://lwn.net/Articles/380931). Therefore, they have to complete their job
quickly enough, to avoid blocking interrupts for too long.
/proc/interrupts on Raspberry Pi 2 (ARM, Linux 4.19)
Note: interrupt numbers shown on the left-most column are virtual numbers when the Device Tree is
used. The physical interrupt numbers can be found in /sys/kernel/debug/irq/irqs/<nr> files when
CONFIG_GENERIC_IRQ_DEBUGFS=y.
Interrupt handler prototype
Typical interrupt handler’s job
Threaded interrupts
Top half and bottom half processing
Top half and bottom half diagram
Softirqs
Example usage of softirqs - NAPI
Tasklets
Workqueues
▶ Workqueues are a general mechanism for deferring work. They are not limited in usage
to handling interrupts. They can typically be used for background jobs.
▶ Functions registered to run in workqueues are called works:
• They can be created with the macro INIT_WORK()
• When scheduled, they are executed by kernel threads (called workers) running in process context,
which means:
All interrupts are enabled
Sleeping is allowed
• Works can be queued on:
The default workqueue, with schedule_work()
A workqueue allocated by the subsystem or the drivers, with alloc_workqueue()
▶ The complete API is in include/linux/workqueue.h
▶ Example (drivers/crypto/atmel-i2c.c):
INIT_WORK(&work_data->work, atmel_i2c_work_handler);
schedule_work(&work_data->work);
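The pattern above can be sketched as an interrupt handler deferring slow processing to the default workqueue. struct my_dev, its events counter, my_work_handler() and my_irq_handler() are hypothetical names chosen for this example.

```c
#include <linux/interrupt.h>
#include <linux/workqueue.h>

/* Hypothetical per-device structure */
struct my_dev {
	struct work_struct work;	/* INIT_WORK(&dev->work, my_work_handler) in probe() */
	unsigned int events;		/* number of processed events */
};

/* Runs later in process context, in a worker thread: sleeping is allowed */
static void my_work_handler(struct work_struct *work)
{
	/* Retrieve the structure embedding the work item */
	struct my_dev *dev = container_of(work, struct my_dev, work);

	/* ... slow processing, possibly taking a mutex or sleeping ... */
	dev->events++;
}

/* Interrupt context: only queue the work and return quickly */
static irqreturn_t my_irq_handler(int irq, void *dev_id)
{
	struct my_dev *dev = dev_id;

	/* Returns false if this work item was already pending */
	schedule_work(&dev->work);
	return IRQ_HANDLED;
}
```

Before the structure is freed, for instance in remove(), cancel_work_sync(&dev->work) makes sure the handler is neither pending nor running.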
Interrupt management summary
▶ Device driver
• In the probe() function, for each device, use devm_request_irq() to register an
interrupt handler for the device’s interrupt channel.
▶ Interrupt handler
• Called when an interrupt is raised.
• Acknowledge the interrupt
• If needed, schedule a per-device tasklet taking care of handling data.
• Wake up processes waiting for the data on a per-device queue
▶ Device driver
• In the remove() function, for each device, the interrupt handler is automatically
unregistered.
Practical lab - Interrupts
Concurrent Access to Resources: Locking
Sources of concurrency issues
Concurrency protection with locks
Linux mutexes
Locking and unlocking mutexes 1/2
Spinlocks
▶ Locks to be used for code that is not allowed to sleep (interrupt handlers), or that
doesn’t want to sleep (critical sections). Be very careful not to call functions
which can sleep!
▶ Originally intended for multiprocessor systems
▶ Spinlocks never sleep and keep spinning in a loop until the lock is available.
▶ The critical section protected by a spinlock is not allowed to sleep.
The spinlock API
Using spinlocks 1/2
▶ So, kernel preemption on the local CPU is disabled. We need to avoid deadlocks
(and unbounded latencies) because of preemption from processes that want to get
the same lock.
▶ Disabling kernel preemption also disables migration to avoid the same kind of
issue as pictured above from happening.
Using spinlocks 2/2
▶ We also need to avoid deadlocks because of interrupts that could want to get the
same lock:
Spinlock example
▶ From drivers/tty/serial/uartlite.c
▶ Spinlock structure embedded into struct uart_port
struct uart_port {
spinlock_t lock;
/* Other fields */
};
▶ Spinlock taken/released with protection against interrupts
unsigned long flags;
spin_lock_irqsave(&port->lock, flags);
/* Do something */
spin_unlock_irqrestore(&port->lock, flags);
More deadlock situations
They can lock up your system. Make sure they never happen!
Rule 1: don’t call a function that can try to get access to the same lock
Rule 2: if you need multiple locks, always acquire them in the same order!
Debugging locking
Concurrency issues
Alternatives to locking
As we have just seen, locking can have a strong negative impact on system
performance. In some situations, you could do without it.
▶ By using lock-free algorithms like Read Copy Update (RCU).
• RCU API available in the kernel
• See https://en.wikipedia.org/wiki/Read-copy-update for an overview of how it
works.
▶ When relevant, use atomic operations.
RCU API
RCU example: ensuring consistent accesses (1/2)
Unsafe read/write
struct myconf { int a, b; } *shared_conf; /* initialized */
RCU example: ensuring consistent accesses (2/2)
Safe read/write with RCU
struct myconf { int a, b; } *shared_conf; /* initialized */
/* Reader side */
rcu_read_lock();
temp = rcu_dereference(shared_conf);
*cur_a = temp->a;
/* If *shared_conf is updated, temp->a and temp->b will remain consistent! */
*cur_b = temp->b;
rcu_read_unlock();
/* Updater side: newconf points to a newly allocated struct myconf */
oldconf = rcu_dereference(shared_conf);
newconf->a = new_a;
newconf->b = new_b;
rcu_assign_pointer(shared_conf, newconf);
/* Readers might still have a reference to the old struct here... */
synchronize_rcu();
/* ...but not here! No more readers of the old struct, kfree() is safe! */
kfree(oldconf);
Atomic variables 1/2
#include <linux/atomic.h>
▶ Useful when the shared resource is an integer value
▶ Even an instruction like n++ is not guaranteed to be atomic on all processors!
▶ Ideal for RMW (Read-Modify-Write) operations
▶ Main atomic operations on atomic_t (signed integer, at least 24 bits):
• Set or read the counter:
void atomic_set(atomic_t *v, int i);
int atomic_read(atomic_t *v);
• Operations without return value:
void atomic_inc(atomic_t *v);
void atomic_dec(atomic_t *v);
void atomic_add(int i, atomic_t *v);
void atomic_sub(int i, atomic_t *v);
Atomic variables 2/2
Atomic bit operations
▶ Supply very fast, atomic operations
▶ On most platforms, apply to an unsigned long * type.
▶ Apply to a void * type on a few others.
▶ Ideal for bitmaps
▶ Set, clear, toggle a given bit:
• void set_bit(int nr, unsigned long *addr);
• void clear_bit(int nr, unsigned long *addr);
• void change_bit(int nr, unsigned long *addr);
▶ Test bit value:
• int test_bit(int nr, unsigned long *addr);
▶ Test and modify (return the previous value):
• int test_and_set_bit(...);
• int test_and_clear_bit(...);
• int test_and_change_bit(...);
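To make the bit addressing and the return values concrete, here is a user-space model built on C11 <stdatomic.h>. It is not the kernel implementation: the model_ functions and MODEL_BITS_PER_LONG are names made up for this sketch, which assumes that bit nr lives in word nr / BITS_PER_LONG of the bitmap.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MODEL_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Word of the bitmap that holds bit nr */
static inline atomic_ulong *model_word(int nr, atomic_ulong *addr)
{
	return &addr[nr / MODEL_BITS_PER_LONG];
}

/* Mask selecting bit nr inside its word */
static inline unsigned long model_mask(int nr)
{
	return 1UL << (nr % MODEL_BITS_PER_LONG);
}

static inline void model_set_bit(int nr, atomic_ulong *addr)
{
	atomic_fetch_or(model_word(nr, addr), model_mask(nr));
}

static inline void model_clear_bit(int nr, atomic_ulong *addr)
{
	atomic_fetch_and(model_word(nr, addr), ~model_mask(nr));
}

static inline bool model_test_bit(int nr, atomic_ulong *addr)
{
	return atomic_load(model_word(nr, addr)) & model_mask(nr);
}

/* Atomically set the bit and report whether it was already set */
static inline bool model_test_and_set_bit(int nr, atomic_ulong *addr)
{
	unsigned long old = atomic_fetch_or(model_word(nr, addr), model_mask(nr));

	return old & model_mask(nr);
}
```

The test-and-modify variants are what make a bitmap usable as a set of flags claimed by several contexts: only the caller that sees the previous value 0 from test_and_set_bit() has actually taken the bit.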
Kernel locking: summary and references
▶ Use mutexes in code that is allowed to sleep
▶ Use spinlocks in code that is not allowed to sleep
Further reading: see the classical dining philosophers problem for a nice illustration of
synchronization and concurrency issues.
Practical lab - Locking
Direct Memory Access
DMA integration
DMA (Direct Memory Access) is used to copy data directly between devices and RAM,
without going through the CPU.
Peripheral DMA
Some device controllers embed their own DMA controller and therefore can do
DMA on their own.
DMA controllers
Other device controllers rely on an external DMA controller (on the SoC). Their drivers
need to submit DMA descriptors to this controller.
DMA descriptors
DMA descriptors describe the various attributes of a DMA transfer, and are chained.
Cache constraints
▶ The CPU can access memory through a data cache
• Using the cache can be more efficient (faster accesses to the cache than the bus)
▶ But the DMA does not access the CPU cache, so one needs to take care of cache
coherency (cache content vs. memory content):
• When the CPU reads from memory accessed by DMA, the relevant cache lines must
be invalidated to force reading from memory again
• When the CPU writes to memory before starting DMA transfers, the cache lines
must be flushed/cleaned in order to force the data to reach the memory
DMA addressing constraints
▶ Memory and devices have physical addresses: phys_addr_t
▶ CPUs usually access memory through an MMU, using virtual pointers: void *
▶ DMA controllers do not access memory through the MMU and thus cannot
manipulate virtual addresses, instead they access a dma_addr_t through either:
• physical addresses directly
• an IOMMU, in which case a specific mapping must be created
DMA memory allocation constraints
The APIs must remain generic and handle all cases transparently, hence:
▶ Each memory chunk accessed by the DMA shall be physically contiguous, which
means one can use:
• any memory allocated by kmalloc() (up to 128 KB)
• any memory allocated by __get_free_pages() (up to 8MB)
• block I/O and networking buffers, designed to support DMA
▶ Unless the buffer is smaller than one page, one cannot use:
• kernel memory allocated with vmalloc()
• user memory allocated with malloc()
Almost all the time userspace relies on the kernel to allocate the buffers and mmap()
them to be usable from userspace (requires a dedicated user API)
Direct Memory Access
dma-mapping vs. dmaengine vs. dma-buf
The dma-mapping API:
▶ Allocates and manages DMA buffers
▶ Offers generic interfaces to handle coherency
▶ Manages IO-MMU DMA mappings when relevant
▶ See core-api/dma-api and core-api/dma-api-howto
The dmaengine API:
▶ Abstracts the DMA controller
▶ Offers generic functions to configure, queue, trigger, stop transfers
▶ Unused when dealing with peripheral DMA
▶ See driver-api/dmaengine/client and
The dma-buf API:
▶ Enables sharing DMA buffers between devices within the kernel
▶ Not covered in this training
dma-mapping: Coherent or streaming DMA mappings
▶ Coherent mappings
• The kernel allocates a suitable buffer and sets the mapping for the driver
• Can simultaneously be accessed by the CPU and device
• So, has to be in a cache coherent memory area
• Usually allocated for the whole time the module is loaded
Can be expensive to setup and use on some platforms
Typically implemented by disabling cache on ARM
▶ Streaming mappings
• Use an already allocated buffer
• The driver provides a buffer, the kernel just sets the mapping
• Mapping set up for each transfer (keeps DMA registers free on the hardware)
dma-mapping: memory addressing constraints
dma-mapping: Allocating coherent memory mappings
#include <linux/dma-mapping.h>
dma-mapping: Setting up streaming memory mappings (single)
#include <linux/dma-mapping.h>
dma_addr_t dma_map_single(
struct device *, /* device structure */
void *, /* input: buffer to use */
size_t, /* buffer size */
enum dma_data_direction /* Either DMA_BIDIRECTIONAL,
* DMA_TO_DEVICE or
* DMA_FROM_DEVICE */
);
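A typical use of this function could look like the sketch below: map, check the mapping, run the transfer, unmap. my_send_buffer() is a hypothetical helper, and the way the address is handed to the hardware is left out because it is device specific.

```c
#include <linux/dma-mapping.h>
#include <linux/errno.h>

/* Hypothetical helper: send len bytes from a kmalloc()ed buffer */
static int my_send_buffer(struct device *dev, void *buf, size_t len)
{
	dma_addr_t dma_handle;

	/* The CPU must not touch buf between map and unmap */
	dma_handle = dma_map_single(dev, buf, len, DMA_TO_DEVICE);
	if (dma_mapping_error(dev, dma_handle))
		return -ENOMEM;

	/* ... give dma_handle and len to the device, wait for the end ... */

	dma_unmap_single(dev, dma_handle, len, DMA_TO_DEVICE);
	return 0;
}
```

The returned dma_addr_t must always be checked with dma_mapping_error() and never compared against 0 or NULL directly.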
dma-mapping: Setting up streaming memory mappings (multiple)
A scatterlist using the scatter-gather library can be used to map several buffers
and link them together
#include <linux/dma-mapping.h>
#include <linux/scatterlist.h>
sg_init_table(sglist, NENTS);
sg_set_buf(&sglist[0], buf0, len0);
sg_set_buf(&sglist[1], buf1, len1);
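Once the table is filled, it still has to be mapped for the device. The following sketch shows one possible continuation with dma_map_sg(); my_map_two_buffers() is a hypothetical helper, and the debug print merely stands in for programming the hardware.

```c
#include <linux/device.h>
#include <linux/dma-mapping.h>
#include <linux/errno.h>
#include <linux/scatterlist.h>

#define NENTS 2

/* Hypothetical helper mapping two kmalloc()ed buffers for one transfer */
static int my_map_two_buffers(struct device *dev, void *buf0, size_t len0,
			      void *buf1, size_t len1)
{
	struct scatterlist sglist[NENTS], *sg;
	int count, i;

	sg_init_table(sglist, NENTS);
	sg_set_buf(&sglist[0], buf0, len0);
	sg_set_buf(&sglist[1], buf1, len1);

	/* Number of mapped segments, possibly fewer than NENTS; 0 on failure */
	count = dma_map_sg(dev, sglist, NENTS, DMA_TO_DEVICE);
	if (!count)
		return -ENOMEM;

	/* Walk the mapped segments, not the original entries */
	for_each_sg(sglist, sg, count, i) {
		dma_addr_t addr = sg_dma_address(sg);
		unsigned int len = sg_dma_len(sg);

		dev_dbg(dev, "segment %d: %pad, %u bytes\n", i, &addr, len);
	}

	/* Unmap with the original number of entries, not with count */
	dma_unmap_sg(dev, sglist, NENTS, DMA_TO_DEVICE);
	return 0;
}
```

With an IOMMU, adjacent entries may be merged, which is why the device must be programmed from sg_dma_address() and sg_dma_len() rather than from the buffers passed to sg_set_buf().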
dma-mapping: Setting up streaming I/O mappings
#include <linux/dma-mapping.h>
dma_addr_t dma_map_resource(
struct device *, /* device structure */
phys_addr_t, /* input: resource to use */
size_t, /* buffer size */
enum dma_data_direction, /* Either DMA_BIDIRECTIONAL,
* DMA_TO_DEVICE or
* DMA_FROM_DEVICE */
unsigned long attrs /* optional attributes */
);
dma-mapping: Verifying DMA memory mappings
dma-mapping: Syncing streaming DMA mappings
▶ In general streaming mappings are:
• mapped right before use with DMA
MEM_TO_DEV: caches are flushed
• unmapped right after
DEV_TO_MEM: cache lines are invalidated
▶ The CPU shall only access the buffer after unmapping!
▶ If however the same memory region has to be used for several DMA transfers, the
same mapping can be kept in place. In this case the data must be synchronized
before access:
• The CPU needs to access the data:
dma_sync_single_for_cpu(dev, dma_handle, size, direction);
dma_sync_sg_for_cpu(dev, sglist, nents, direction);
• The device needs to access the data:
dma_sync_single_for_device(dev, dma_handle, size, direction);
dma_sync_sg_for_device(dev, sglist, nents, direction);
Starting DMA transfers
▶ If the device you’re writing a driver for is doing peripheral DMA, no external API
is involved.
▶ If it relies on an external DMA controller, you’ll need to
1. Ask the hardware to use DMA, so that it will drive its request line
2. Use the Linux dmaengine framework, especially its slave API
The dmaengine framework
dmaengine: Slave API: Initial configuration
Steps to start a DMA transfer with dmaengine:
1. Request a channel for exclusive use with dma_request_chan(), or one of its
variants
• This channel pointer will be used all along
• Returns a pointer to a struct dma_chan, which can also be an error pointer (check it with IS_ERR())
2. Configure the engine by filling a struct dma_slave_config structure and passing
it to dmaengine_slave_config():
struct dma_slave_config txconf = {};
dmaengine: Slave API: Per-transfer configuration (1/2)
1. Create a descriptor with all the required configuration for the next transfer with:
struct dma_async_tx_descriptor *
dmaengine_prep_slave_single(struct dma_chan *chan, dma_addr_t buf,
size_t len, enum dma_transfer_direction dir,
unsigned long flags);
struct dma_async_tx_descriptor *
dmaengine_prep_slave_sg(struct dma_chan *chan, struct scatterlist *sgl,
unsigned int sg_len, enum dma_transfer_direction dir,
unsigned long flags);
struct dma_async_tx_descriptor *
dmaengine_prep_dma_cyclic(struct dma_chan *chan, dma_addr_t buf, size_t buf_len,
size_t period_len, enum dma_transfer_direction dir,
unsigned long flags);
dmaengine: Slave API: Per-transfer configuration (2/2)
2. Queue the transfer on the channel and check the returned cookie:
cookie = dmaengine_submit(desc);
ret = dma_submit_error(cookie);
if (ret)
...
3. Trigger the queued transfers
dma_async_issue_pending(chan);
3bis. In case anything went wrong or the device should stop being used, it is possible to
terminate all ongoing transactions with:
dmaengine_terminate_sync(chan);
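The whole sequence, from channel request to triggering the transfer, can be sketched as follows. The "tx" channel name, the 4-byte FIFO width, the burst size and the my_ function names are assumptions made for this example and would depend on the actual device.

```c
#include <linux/dmaengine.h>
#include <linux/err.h>
#include <linux/errno.h>

/* Initial configuration, typically done once in probe() */
static struct dma_chan *my_dma_setup(struct device *dev, phys_addr_t fifo_addr)
{
	struct dma_slave_config txconf = {};
	struct dma_chan *chan;
	int ret;

	chan = dma_request_chan(dev, "tx");	/* assumed channel name */
	if (IS_ERR(chan))
		return chan;

	txconf.dst_addr = fifo_addr;		/* device FIFO register */
	txconf.dst_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES;
	txconf.dst_maxburst = 1;
	ret = dmaengine_slave_config(chan, &txconf);
	if (ret) {
		dma_release_channel(chan);
		return ERR_PTR(ret);
	}

	return chan;
}

/* Hypothetical callback, run when the transfer is finished */
static void my_dma_done(void *param)
{
	/* e.g. complete() a struct completion passed through param */
}

/* Per-transfer steps: prepare, submit, trigger */
static int my_start_tx(struct dma_chan *chan, dma_addr_t buf, size_t len)
{
	struct dma_async_tx_descriptor *desc;
	dma_cookie_t cookie;
	int ret;

	desc = dmaengine_prep_slave_single(chan, buf, len, DMA_MEM_TO_DEV,
					   DMA_PREP_INTERRUPT);
	if (!desc)
		return -ENOMEM;

	desc->callback = my_dma_done;
	desc->callback_param = NULL;

	cookie = dmaengine_submit(desc);
	ret = dma_submit_error(cookie);
	if (ret)
		return ret;

	dma_async_issue_pending(chan);
	return 0;
}
```

buf is expected to be a dma_addr_t obtained from the dma-mapping API, and the channel is given back with dma_release_channel() when the driver no longer needs it.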
Examples
Practical lab - DMA
Kernel debugging
Debugging using messages (1/3)
Debugging using messages (2/3)
▶ The dev_*() family of functions: dev_emerg(), dev_alert(), dev_crit(),
dev_err(), dev_warn(), dev_notice(), dev_info()
and the special dev_dbg() (see next page)
• They take a pointer to struct device as first argument, and then a format string
with arguments
• Defined in include/linux/dev_printk.h
• To be used in drivers integrated with the Linux device model
• Example:
dev_info(&pdev->dev, "in probe\n");
• Here’s what you get in the kernel log:
[ 25.878382] serial 48024000.serial: in probe
[ 25.884873] serial 481a8000.serial: in probe
▶ *_ratelimited() versions exist, which limit the amount of output when called too
often, based on the /proc/sys/kernel/printk_ratelimit{_burst} values
Debugging using messages (3/3)
▶ The kernel defines many more format specifiers than the standard printf()
existing ones.
• %p: Display the hashed value of pointer by default.
• %px: Always display the address of a pointer (use carefully on non-sensitive
addresses).
• %pK: Display hashed pointer value, zeros or the pointer address depending on
kptr_restrict sysctl value.
• %pOF: Device-tree node format specifier.
• %pr: Resource structure format specifier.
• %pa: Physical address display (works on all architectures, 32-bit and 64-bit)
• %pe: Error pointer (displays the string corresponding to the error number)
▶ /proc/sys/kernel/kptr_restrict should be set to 1 in order to display pointers
which use %pK
▶ See core-api/printk-formats for an exhaustive list of supported format
specifiers
pr_debug() and dev_dbg()
▶ When the driver is compiled with DEBUG defined, all these messages are compiled
and printed at the debug level. DEBUG can be defined by #define DEBUG at the
beginning of the driver, or using ccflags-$(CONFIG_DRIVER) += -DDEBUG in the
Makefile
▶ When the kernel is compiled with CONFIG_DYNAMIC_DEBUG, then these messages
can dynamically be enabled on a per-file, per-module or per-message basis, by
writing commands to /proc/dynamic_debug/control. Note that messages are
not enabled by default.
• Details in admin-guide/dynamic-debug-howto
• Very powerful feature to only get the debug messages you’re interested in.
▶ When neither DEBUG nor CONFIG_DYNAMIC_DEBUG is used, these messages are not
compiled in.
Configuring the priority
DebugFS
DebugFS API
▶ Create a sub-directory for your driver:
• struct dentry *debugfs_create_dir(const char *name,
struct dentry *parent);
▶ Expose an integer as a file in DebugFS. Example:
• void debugfs_create_u8
(const char *name, umode_t mode, struct dentry *parent,
u8 *value);
u8, u16, u32, u64 for decimal representation
x8, x16, x32, x64 for hexadecimal representation
▶ Expose a binary blob as a file in DebugFS:
• struct dentry *debugfs_create_blob(const char *name,
umode_t mode, struct dentry *parent,
struct debugfs_blob_wrapper *blob);
▶ Also possible to support writable DebugFS files or customize the output using the
more generic debugfs_create_file() function.
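As an illustration, a driver could expose an interrupt counter as shown below. The "my_driver" directory, the irq_count field and the my_debugfs_ functions are hypothetical names, and the sketch assumes a recent kernel in which the debugfs_create_u32() helper returns void.

```c
#include <linux/debugfs.h>
#include <linux/types.h>

/* Hypothetical per-device structure */
struct my_dev {
	struct dentry *dbg_dir;
	u32 irq_count;		/* incremented by the interrupt handler */
};

static void my_debugfs_init(struct my_dev *dev)
{
	/* NULL parent: the directory is created at the root of debugfs */
	dev->dbg_dir = debugfs_create_dir("my_driver", NULL);

	/* Read-only file showing the counter in decimal */
	debugfs_create_u32("irq_count", 0444, dev->dbg_dir, &dev->irq_count);
}

static void my_debugfs_exit(struct my_dev *dev)
{
	/* Removes the directory and everything below it */
	debugfs_remove_recursive(dev->dbg_dir);
}
```

The return values are deliberately not checked: debugfs is a debugging aid, and a driver is expected to keep working when the files cannot be created.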
Deprecated debugging mechanisms
Using Magic SysRq
Functionality provided by serial drivers
▶ Allows running multiple debug / rescue commands even when the kernel seems to
be in deep trouble
• On PC: press [Alt] + [Prnt Scrn] + <character> simultaneously
([SysRq] = [Alt] + [Prnt Scrn])
• On embedded: in the console, send a break character
(Picocom: press [Ctrl] + a followed by [Ctrl] + \ ), then press <character>
▶ Example commands:
• h: show available commands
• s: sync all mounted filesystems
• b: reboot the system
• n: makes RT processes nice-able.
• w: shows the kernel stack of all blocked processes (in uninterruptible sleep)
• t: shows the kernel stack of all current tasks
• You can even register your own!
▶ Detailed in admin-guide/sysrq
kgdb - A kernel debugger
Using kgdb (1/2)
Using kgdb (2/2)
▶ Then also pass kgdbwait to the kernel: it makes kgdb wait for a debugger
connection.
▶ Boot your kernel, and when the console is initialized, interrupt the kernel with a
break character and then g in the serial console (see our Magic SysRq
explanations).
▶ On your workstation, start gdb as follows:
• arm-linux-gdb ./vmlinux
• (gdb) set serial baud 115200 (set remotebaud 115200 with older gdb versions)
• (gdb) target remote /dev/ttyS0
▶ Once connected, you can debug a kernel the way you would debug an application
program.
▶ On the GDB side, the first threads represent the CPU contexts (ShadowCPU<x>);
each of the remaining threads represents a task.
Debugging with a JTAG interface
Two types of JTAG dongles
▶ The ones offering a gdb compatible interface, over a serial port or an Ethernet
connection. gdb can directly connect to them.
▶ The ones not offering a gdb compatible interface are generally supported by
OpenOCD (Open On Chip Debugger): http://openocd.sourceforge.net/
• OpenOCD is the bridge between the gdb debugging language and the JTAG
interface of the target CPU.
• See the very complete documentation:
https://openocd.org/pages/documentation.html
• For each board, you’ll need an OpenOCD configuration file (ask your supplier)
Early traces
▶ If something breaks before the tty layer, serial driver and serial console are
properly registered, you might see nothing at all after "Starting kernel..."
▶ On ARM, if your platform implements it, you can enable CONFIG_DEBUG_LL and
CONFIG_EARLYPRINTK, and add earlyprintk to the kernel command line
• Assembly routines to just push a character and wait for it to be sent
• Extremely basic, but part of the uncompressed section, so available even if the
kernel does not uncompress correctly!
▶ On other platforms, provided that your serial driver implements
OF_EARLYCON_DECLARE(), you can enable CONFIG_SERIAL_EARLYCON and add
earlycon to the kernel command line
• The kernel will try to hook an appropriate earlycon UART driver using the
stdout-path of the device-tree.
More kernel debugging tips
Getting help and reporting bugs
▶ If you are using a custom kernel from a hardware vendor, contact that company.
The community will have less interest in supporting a custom kernel.
▶ Otherwise, or if this doesn’t work, try to reproduce the issue on the latest version
of the kernel.
▶ Make sure you investigate the issue as much as you can: see
admin-guide/bug-bisect
▶ Check for previous bug reports. Use web search engines and public mailing
list archives.
▶ If you’re the first to face the issue, it’s very useful for others to report it, even if
you cannot investigate it further.
▶ If the subsystem you report a bug on has a mailing list, use it. Otherwise, contact
the official maintainer (see the MAINTAINERS file). Always give as many useful
details as possible.
Practical lab - Kernel debugging
Power Management
PM building blocks
Clock framework (1)
Clock framework (2)
Diagram overview of the common clock framework
Clock framework (3)
Clock framework (4)
Suspend and resume (to / from RAM)
Triggering suspend / hibernate
Saving power in the idle loop
▶ The idle loop is what you run when there’s nothing left to run in the system.
▶ arch_cpu_idle() is implemented in all architectures in
arch/<arch>/kernel/process.c
▶ Example: arch/arm/kernel/process.c
▶ The CPU can run power saving HLT instructions, enter NAP mode, and even
disable the timers (tickless systems).
▶ See also https://en.wikipedia.org/wiki/Idle_loop
Managing idle
PowerTOP
https://en.wikipedia.org/wiki/PowerTOP
▶ With dynamic ticks, helps fix parts of kernel code and applications that wake
up the system too often.
▶ PowerTOP lets you track the worst offenders
▶ Now available on ARM CPUs implementing CPUidle
▶ Also gives you useful hints for reducing power.
▶ Try it on your x86 laptop:
sudo powertop
Runtime power management
▶ Managing per-device idle, each device being managed by its device driver
independently from others.
▶ According to the kernel configuration interface: Enable functionality allowing I/O
devices to be put into energy-saving (low power) states at run time (or
autosuspended) after a specified period of inactivity and woken up in response to
a hardware-generated wake-up event or a driver’s request.
▶ New hooks must be added to the drivers: runtime_suspend(),
runtime_resume(), runtime_idle() in the struct dev_pm_ops structure in
struct device_driver.
▶ API and details in power/runtime_pm
▶ See drivers/net/ethernet/cadence/macb_main.c again.
Generic PM Domains (genpd)
▶ Generic infrastructure to implement power domains based on Device Tree
descriptions, which lets you group devices by the physical power domain they belong
to. This sits at the same level as bus type for calling PM hooks.
▶ All the devices in the same PD get the same state at the same time.
▶ Specifications and examples available at
Documentation/devicetree/bindings/power/power_domain.txt
▶ Driver example: drivers/soc/rockchip/pm_domains.c
(rockchip_pd_power_on(), rockchip_pd_power_off(),
rockchip_pm_add_one_domain()...)
▶ DT example: look for rockchip,px30-power-controller
(arch/arm64/boot/dts/rockchip/px30.dtsi) and find PD definitions and
corresponding devices.
▶ See Kevin Hilman’s talk at Kernel Recipes 2017:
https://youtu.be/SctfvoskABM
Frequency and voltage scaling (1)
Frequency and voltage scaling is possible through the cpufreq kernel infrastructure.
▶ Generic infrastructure: drivers/cpufreq/cpufreq.c and
include/linux/cpufreq.h
▶ Generic governors, responsible for deciding frequency and voltage transitions
• performance: maximum frequency
• powersave: minimum frequency
• ondemand: measures CPU load to adjust frequency
• conservative: often better than ondemand. Only increases frequency gradually
when the CPU gets loaded.
• schedutil: Tightly integrated with the scheduler, making per-policy decisions, RT
tasks running at full speed.
• userspace: leaves the decision to a user space daemon.
▶ This infrastructure can be controlled from
/sys/devices/system/cpu/cpu<n>/cpufreq/
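As an illustration of this sysfs interface, here is a minimal user-space sketch that reads one attribute from the cpufreq directory of a CPU. The helper names read_sysfs_line() and cpufreq_read_attr() are made up for this example, and the attribute names a caller would pass (such as scaling_governor or scaling_cur_freq) are not listed on the slide; they are assumed to be present on a system with cpufreq enabled.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Read the first line of a sysfs-style text file into buf,
 * without the trailing newline.
 * Returns 0 on success or a negative errno value on failure.
 */
static int read_sysfs_line(const char *path, char *buf, size_t len)
{
	FILE *f;

	assert(buf != NULL && len > 0);

	f = fopen(path, "r");
	if (!f)
		return -errno;

	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -EIO;
	}
	fclose(f);

	buf[strcspn(buf, "\n")] = '\0';	/* drop the trailing newline */
	return 0;
}

/* Build /sys/devices/system/cpu/cpu<n>/cpufreq/<attr> and read it. */
static int cpufreq_read_attr(unsigned int cpu, const char *attr,
			     char *buf, size_t len)
{
	char path[128];
	int n;

	n = snprintf(path, sizeof(path),
		     "/sys/devices/system/cpu/cpu%u/cpufreq/%s", cpu, attr);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -ENAMETOOLONG;

	return read_sysfs_line(path, buf, len);
}
```

A caller would typically try cpufreq_read_attr(0, "scaling_governor", buf, sizeof(buf)) and treat a negative return value as "cpufreq not available for this CPU".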
Frequency and voltage scaling (2)
Regulator framework
▶ Modern embedded platforms have hardware responsible for voltage and current
regulation
▶ The regulator framework lets you take advantage of this hardware to save power
when parts of the system are unused. It provides:
• A consumer interface for device drivers (i.e. users)
• Regulator driver interface for regulator drivers
• Machine interface for board configuration
• sysfs interface for user space
▶ See power/regulator/ in kernel documentation.
BSP work for a new board
In case you just need to create a BSP for your board, and your CPU already has full
PM support, you should just need to:
▶ Create clock definitions and bind your devices to them.
▶ Implement PM handlers (suspend, resume) in the drivers for your board specific
devices.
▶ Implement runtime PM handlers in your drivers.
▶ Implement board specific power management if needed (mainly battery
management)
▶ Implement regulator framework hooks for your board if needed.
▶ Attach on-board devices to PM domains if needed
▶ All other parts of the PM infrastructure should be already there: suspend /
resume, cpuidle, cpu frequency and voltage scaling, PM domains.
Useful resources
Kernel Resources
Kernel Development News
Useful Online Resources
▶ Kernel documentation
• https://kernel.org/doc/
▶ Linux kernel mailing list FAQ
• http://vger.kernel.org/lkml/
• Complete Linux kernel FAQ
• Read this before asking a question on the mailing list
▶ Linux kernel mailing lists
• http://lore.kernel.org/
• Easy browsing and referencing of all e-mail threads
• Easy access to an mbox in order to reply to e-mails you were not Cc’ed on
▶ Kernel Newbies
• https://kernelnewbies.org/
• Articles, presentations, HOWTOs, recommended reading, useful tools for people
getting familiar with Linux kernel or driver development.
• Glossary: https://kernelnewbies.org/KernelGlossary
• In-depth coverage of the new features in each kernel release:
https://kernelnewbies.org/LinuxChanges
▶ The https://elinux.org wiki
International Conferences (1)
▶ Embedded Linux Conference:
• https://embeddedlinuxconference.com/
• Organized by the Linux Foundation
• Once per year, alternating North America/Europe
• Very interesting kernel and user space topics for embedded
systems developers. Many kernel and embedded project
maintainers are present.
• Presentation slides and videos freely available on
https://elinux.org/ELC_Presentations
▶ Linux Plumbers
• https://linuxplumbersconf.org
• About the low-level plumbing of Linux: kernel, audio, power
management, device management, multimedia, etc.
• Not really a conventional conference with formal
presentations, but rather a place where contributors on each
topic meet, share their progress and make plans for work
ahead.
International Conferences (2)
After the course
Continue to learn:
▶ Run your labs again on your own hardware. The Nunchuk lab should be
rather straightforward, but the serial lab will be quite different if you use a
different processor.
▶ Learn by reading the kernel code by yourself, ask questions and propose
improvements.
▶ Implement and share drivers for your own hardware, of course!
Hobbyists can make their first contributions by:
▶ Helping with tasks keeping the kernel code clean and up-to-date:
https://kernelnewbies.org/KernelJanitors/Todo
▶ Proposing fixes for issues reported by the Coccinelle tool:
make coccicheck
▶ Helping to improve drivers in drivers/staging/
▶ Investigating and triaging issues reported by Coverity Scan:
https://scan.coverity.com/projects/linux
Contribute your changes
Recommended resources
▶ See process/submitting-patches for guidelines and
https://kernelnewbies.org/UpstreamMerge for very helpful advice to have your
changes merged upstream (by Rik van Riel).
▶ Watch the Write and Submit your first Linux kernel Patch talk by Greg Kroah-Hartman:
https://www.youtube.com/watch?v=LLBrBBImJt4
▶ How to Participate in the Linux Community (by Jonathan Corbet).
A guide to the kernel development process.
http://www.static.linuxfound.org/sites/lfcorp/files/How-Participate-Linux-Community_0.pdf
Last slides
Last slide
Thank you!
And may the Source be with you
Rights to copy
Backup slides
mmap
mmap
▶ Possibility to have parts of the virtual address space of a program mapped to the
contents of a file
▶ Particularly useful when the file is a device file
▶ Lets you access device I/O memory and ports without having to go through
(expensive) read, write or ioctl calls
▶ You can list the files currently mapped by a process in two ways:
• /proc/<pid>/maps
• pmap <pid>
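To make the idea concrete, here is a minimal user-space sketch that maps an ordinary file with mmap() and then looks for the resulting mapping in /proc/self/maps. It uses a regular file rather than a device file so it can be tried anywhere; the helper names map_file() and mapping_is_listed() are made up for this example.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map the first len bytes of a file read-only. Returns NULL on failure. */
static void *map_file(const char *path, size_t len)
{
	void *addr;
	int fd;

	assert(len > 0);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return NULL;

	addr = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
	close(fd);	/* the mapping remains valid after close() */

	return addr == MAP_FAILED ? NULL : addr;
}

/* Return 1 if addr falls inside a range listed in /proc/self/maps. */
static int mapping_is_listed(const void *addr)
{
	unsigned long start, end, target = (unsigned long)(uintptr_t)addr;
	char line[512];
	int found = 0;
	FILE *f = fopen("/proc/self/maps", "r");

	if (!f)
		return 0;

	/* Each line starts with "<start>-<end>" in hexadecimal */
	while (fgets(line, sizeof(line), f)) {
		if (sscanf(line, "%lx-%lx", &start, &end) == 2 &&
		    target >= start && target < end) {
			found = 1;
			break;
		}
	}
	fclose(f);
	return found;
}
```

Once map_file() has returned a pointer, the file contents can be read through it like normal memory, and munmap() releases the mapping.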
/proc/<pid>/maps
mmap overview
How to Implement mmap - User space
How to Implement mmap - Kernel space
▶ Character driver: implement an mmap file operation and add it to the driver file
operations:
int (*mmap) (
struct file *, /* Open file structure */
struct vm_area_struct * /* Kernel VMA structure */
);
▶ Initialize the mapping.
• Can be done in most cases with the remap_pfn_range() function, which takes care
of most of the job.
remap_pfn_range()
int remap_pfn_range(
struct vm_area_struct *, /* VMA struct */
unsigned long virt_addr, /* Starting user
* virtual address */
unsigned long pfn, /* pfn of the starting
* physical address */
unsigned long size, /* Mapping size */
pgprot_t prot /* Page permissions */
);
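The pfn argument is a page frame number, that is, the physical address divided by the page size. The arithmetic can be sketched in plain user-space C as follows, assuming 4 KiB pages; the DEMO_ macros and helper names are made up for this example and only mimic what PAGE_SHIFT does in the kernel.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12	/* assumption: 4 KiB pages */
#define DEMO_PAGE_SIZE  (UINT64_C(1) << DEMO_PAGE_SHIFT)

/* Convert a page-aligned physical address to its page frame number. */
static uint64_t phys_to_pfn(uint64_t phys)
{
	assert((phys & (DEMO_PAGE_SIZE - 1)) == 0);
	return phys >> DEMO_PAGE_SHIFT;
}

/* Convert a page frame number back to a physical address. */
static uint64_t pfn_to_phys(uint64_t pfn)
{
	return pfn << DEMO_PAGE_SHIFT;
}

/* Number of pages needed to cover size bytes, rounding up. */
static uint64_t size_to_pages(uint64_t size)
{
	return (size + DEMO_PAGE_SIZE - 1) >> DEMO_PAGE_SHIFT;
}
```

With these assumptions, a hypothetical device region at physical address 0x40000000 starts at page frame 0x40000.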
Simple mmap implementation
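static int acme_mmap
(struct file * file, struct vm_area_struct *vma)
{
    /* Length of the user-space region being mapped */
    unsigned long size = vma->vm_end - vma->vm_start;

    /* Map it to the device memory starting at ACME_PHYS */
    if (remap_pfn_range(vma,
                        vma->vm_start,
                        ACME_PHYS >> PAGE_SHIFT,
                        size,
                        vma->vm_page_prot))
        return -EAGAIN;
    return 0;
}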
devmem2
mmap summary
Backup slides
Memory/string utilities
▶ In include/linux/string.h
• Memory-related: memset(), memcpy(), memmove(), memscan(), memcmp(), memchr()
• String-related: strcpy(), strcat(), strcmp(), strchr(), strrchr(), strlen()
and variants
• Allocate and copy a string: kstrdup(), kstrndup()
• Allocate and copy a memory area: kmemdup()
▶ In include/linux/kernel.h
• String to int conversion: simple_strtoul(), simple_strtol(),
simple_strtoull(), simple_strtoll()
• Other string functions: sprintf(), sscanf()
Linked lists
Linked lists examples 1/2
From include/soc/at91/atmel_tcb.h
/*
* Definition of a list element, with a
* struct list_head member
*/
struct atmel_tc
{
/* some members */
struct list_head node;
};
Linked lists examples 2/2
From drivers/misc/atmel_tclib.c
/* Define the global list */
static LIST_HEAD(tc_list);
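To show how the two pieces fit together, here is a simplified user-space reimplementation of the idea. It is not the kernel's include/linux/list.h, only a sketch that borrows its names (struct list_head, LIST_HEAD, list_add_tail, container_of); struct demo_tc and sum_ids() are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Doubly linked, circular list node, embedded in the user's structure */
struct list_head {
	struct list_head *next, *prev;
};

/* An empty list is a head that points to itself */
#define LIST_HEAD_INIT(name) { &(name), &(name) }
#define LIST_HEAD(name) struct list_head name = LIST_HEAD_INIT(name)

/* Get the enclosing structure from a pointer to its embedded member */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Insert new_node just before head, i.e. at the end of the list */
static void list_add_tail(struct list_head *new_node, struct list_head *head)
{
	assert(new_node != NULL && head != NULL);

	new_node->prev = head->prev;
	new_node->next = head;
	head->prev->next = new_node;
	head->prev = new_node;
}

/* A list element: any structure with a struct list_head member */
struct demo_tc {
	int id;
	struct list_head node;
};

/* Define the global list */
static LIST_HEAD(tc_list);

/* Walk the list and add up the id field of every element */
static int sum_ids(struct list_head *head)
{
	struct list_head *pos;
	int sum = 0;

	for (pos = head->next; pos != head; pos = pos->next)
		sum += container_of(pos, struct demo_tc, node)->id;

	return sum;
}
```

The list itself only links struct list_head nodes; container_of() is what recovers the enclosing element, which is why the same list code can serve any structure type.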