Linux Kernel Development
December 2010
Jonathan Corbet, LWN.net
Greg Kroah-Hartman, SuSE Labs / Novell Inc.
Amanda McPherson, The Linux Foundation
Since 2005, over 6100 individual developers from over 600 different companies have contributed
to the kernel. The Linux kernel, thus, has become a common resource developed on a massive
scale by companies which are fierce competitors in other areas.
The first version of this study was published in 2008; it was then updated in 2009. That update noted
a number of changes, including a 10% increase in the number of developers participating in each
release cycle, a notable increase in the number of companies participating, and a tripling of the
rate at which code is being added to the kernel. At that time, all of the numbers were in a period
of rapid increase.
As documented in the last paper, in 2009 the Linux community saw, with the release of 2.6.30, a
peak in the lines of code added. This can largely be attributed to significant new features being
added to the kernel, most notably the first additions of Btrfs, perf and ftrace, as well as the peak of
the inflow from the Linux-staging tree that had been happening for some time.
This update shows a slightly different picture. The number of commits peaked with the 2.6.30
release; the number of commits for 2.6.35 was 18% lower. Most other metrics have fallen as well.
In short, we see a step back from the frenzied activity of 2.6.30 even though the number of
developers involved has fallen only slightly since its peak in 2.6.32.
The numbers in this edition of the paper reflect the natural development cycle of an operating
system that had major pieces added/changed in the previous year. Of course the Linux kernel
community is still hard at work and growing. In fact, there have been 1.5 million lines of code
added to the kernel since the 2009 update. Since the last paper, additions and changes translate
to an amazing 9,058 lines added, 4,495 lines removed, and 1,978 lines changed every day -
weekends and holidays included.
The data in this year’s update also shows a strong presence of new players in the Linux kernel
development space from the world of mobile/consumer electronics and embedded technology
(and their suppliers). This is a healthy development and not surprising given the growth of Linux
usage in embedded devices, even though the authors would like to see more companies from
that space participate in the Linux development community.
The overall picture shows a robust development community which continues to grow both in size
and in productivity.
The Linux kernel is an interesting project to study for a number of reasons. It is one of the largest
individual components on almost any Linux system. It also features one of the fastest-moving
development processes and involves more developers than any other open source project. Since
2005, kernel development history is also quite well documented, thanks to the use of the Git source
code management system.
This paper takes advantage of that development history to look at how the process works,
focusing on over five years of kernel history as represented by the 2.6.11 through 2.6.35
releases. This is the third version of this paper, following up on the original study, which was
published in April 2008
(http://www.linuxfoundation.org/sites/main/files/publications/linuxkerneldevelopment.pdf),
and the 2009 update
(http://www.linuxfoundation.org/sites/main/files/publications/whowriteslinux.pdf),
which looked at the history through the 2.6.30 release. A look at the five kernel
releases which have happened since then shows that, while many things remain the same, others
are changing.
Development Model
Linux kernel development proceeds under a loose, time-based release model, with a new major
kernel release occurring every 2-3 months. This model, which was first formalized in 2005, gets new
features into the mainline kernel and out to users with a minimum of delay. That, in turn, speeds
the pace of development and minimizes the number of external changes that distributors need to
apply. As a result, distributor kernels contain relatively few distribution-specific changes; this leads
to higher quality and fewer differences between distributions.
One significant change since the initial version of this paper is the establishment of the linux-
next tree. Linux-next serves as a staging area for the next kernel development cycle; as of this
writing, 2.6.36 is in the stabilization phase, so linux-next contains changes intended for 2.6.37. This
repository gives developers a better view of which changes are coming in the future and helps
them to ensure that there will be a minimum of integration problems when the next development
cycle begins. Linux-next smooths out the development cycle, helping it to scale to higher rates of
change.
After each mainline 2.6 release, the kernel’s “stable team” (currently Greg Kroah-Hartman) takes
up short-term maintenance, applying important fixes as they are developed. The stable process
Release Frequency
The desired release period for a major kernel release is, by common consensus, 8-12 weeks. A
much shorter period would not give testers enough time to find problems with new kernels,
while a longer period would allow too much work to pile up between releases. The actual time
between kernel releases tends to vary a bit, depending on the size of the release and the difficulty
encountered in tracking down the last regressions. Since 2.6.11, the actual kernel release history
looks like:
The average kernel development cycle currently runs for 81 days, just under twelve weeks.
[Figure: Changes (patches) per kernel version, 2.6.11 through 2.6.35]
By taking into account the amount of time required for each kernel release, one can arrive at the
number of changes accepted into the kernel per hour. The results can be seen in this table:
So, between the 2.6.11 and 2.6.35 kernel releases (which were 1902 days apart), there were, on
average, 4.02 patches applied to the kernel tree per hour. In the time since the publication of the
previous version of this paper, that rate has been significantly higher: 5.18 patches per hour. As the
Linux kernel grows, the rate of change is growing with it.
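The arithmetic behind the per-hour figure can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, and the total it prints is simply what the 1902-day span and the 4.02 patches/hour quoted above imply (roughly 183,500 patches), not a count taken from the repository.

```python
HOURS_PER_DAY = 24


def patches_per_hour(total_patches: int, days: float) -> float:
    """Average number of patches merged per hour over a span of days."""
    return total_patches / (days * HOURS_PER_DAY)


def implied_total(rate_per_hour: float, days: float) -> int:
    """Total patch count implied by an hourly rate sustained over a span of days."""
    return round(rate_per_hour * days * HOURS_PER_DAY)


# Figures quoted in the text: 2.6.11 to 2.6.35, 1902 days, 4.02 patches/hour.
total = implied_total(4.02, 1902)
print(total)
# Converting the implied total back recovers the quoted rate.
print(round(patches_per_hour(total, 1902), 2))
```

The same `patches_per_hour` helper can be applied to a single release by passing that release's patch count and the length of its development cycle in days.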
The rate of change has slowed slightly from the rate (5.45 patches/hour) reported in the 2009
update; the peak rate seen with the 2.6.30 kernel release has not been repeated. Development
rates are naturally variable, and the rates for the kernel have never increased in a monotonic
fashion; that said, the rate of change has remained notably lower for the last year. There are a
couple of explanations for that trend:
The kernels since 2.6.30 have seen the completion and stabilization of a number of long-term
projects, including the ext4 and btrfs filesystems, the addition of the ftrace and perf events
subsystems, and the reimplementation of our graphics layer. Rates of change will naturally
slow as the finishing touches are put on these developments.
The addition of the staging tree in 2.6.28 began a process of merging a large amount of out-of-
tree code into the mainline kernel. By the 2.6.31 development cycle, that process was slowing
down as the backlog of code was taken care of. There are still new drivers entering the kernel
via the staging tree, but they are now arriving at a rate which more closely reflects the actual
rate of development.
The burst of activity caused by the staging tree is not likely to be repeated anytime soon, but the
pace of kernel development as a whole can be expected to increase as developers take on new
challenges in the future.
It is also worth noting that the above figures understate the total level of activity; most patches
go through a number of revisions before being accepted into the mainline kernel, and many are
never accepted at all. The ability to sustain this rate of change for years is unprecedented in any
previous public software project.
Stable Updates
As mentioned toward the beginning of this document, kernel development does not stop with a
mainline release. Inevitably, problems will be found in released kernels, and patches will be made
to fix those problems. The stable kernel update process was designed to capture those patches
in a way that ensures that both the mainline kernel and current releases are fixed. These stable
The stable kernel update history (since the stable kernel process was introduced after the 2.6.11
release) looks like this:
As can be seen, the number of updates going into stable kernels has grown over the years. The
main driver for this increase is a much higher level of discipline in the development community:
we have gotten much better at evaluating patches and identifying those which are applicable
to released kernels. Additionally, some kernels are receiving stable updates for relatively long
periods of time; the 2.6.27 kernel is still being updated as of this writing.
With just over five years of history, the stable update series has proven its value by allowing the final
fixes to be made to released kernels while, simultaneously, letting mainline development move
forward.
The information in the following table shows the number of files and lines in each kernel version.
Since the first version of this paper, the kernel has grown by almost 6.7 million lines of code - 1.5
million since the 2009 update. But the kernel is not just growing. With every change that is made
to the kernel source tree, lines are added, modified, and deleted in order to accomplish the
needed changes. Looking at these numbers, broken down by days, shows how quickly the kernel
source tree is being worked on over time. This can be seen in the following table:
Summing up these numbers, it comes to an impressive 6,683 lines added, 3,774 lines removed,
and 1,797 lines changed every day for the past 5.5 years. Since 2.6.30, those numbers jump to an
amazing 9,058 lines added, 4,495 lines removed, and 1,978 lines changed every day - weekends
and holidays included. That rate of change is larger than that of any other public software project of any
size.
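To make the per-day figures easier to compare, the short sketch below derives the net daily growth (lines added minus lines removed) for both periods. The `DailyChurn` class and its property names are hypothetical; the only inputs are the six numbers quoted in the paragraph above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DailyChurn:
    """Average lines added, removed, and changed per day (hypothetical helper)."""
    added: int
    removed: int
    changed: int

    @property
    def net_growth(self) -> int:
        # The tree grows only by the difference between additions and removals.
        return self.added - self.removed

    @property
    def touched(self) -> int:
        # Total lines affected per day, regardless of direction.
        return self.added + self.removed + self.changed


# Averages quoted in the text.
whole_history = DailyChurn(added=6683, removed=3774, changed=1797)
since_2_6_30 = DailyChurn(added=9058, removed=4495, changed=1978)

for label, churn in (("whole history", whole_history), ("since 2.6.30", since_2_6_30)):
    print(f"{label}: net {churn.net_growth} lines/day, {churn.touched} lines touched/day")
```

Under these figures the net growth since 2.6.30 is about 1.6 times the long-run average, which is one way to read the statement that the kernel is not just growing but being reworked continuously.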
These numbers show a steady increase in the number of developers contributing to each kernel
release over a period of several years.
Despite the large number of individual developers, there is still a relatively small number who
are doing the majority of the work. In any given development cycle, approximately 1/3 of
the developers involved contribute exactly one patch. Over the past 5.5 years, the top 10
individual developers have contributed 10% of the total changes and the top 30 developers have
contributed almost 22% of the total. The list of individual developers, the number of changes they
have contributed, and the percentage of the overall total can be seen here:
The above numbers are drawn from the entire git repository history, starting with 2.6.12. If we look
at the commits since the second version of this paper (2.6.30) through 2.6.35, the picture is similar
but not identical:
It is amusing to note that Linus Torvalds (886 total changes, 168 since 2.6.30) does not appear in the
top-30 list. Linus remains an active and crucial part of the development process; his contribution
cannot be measured just by the number of changes made. We are seeing a similar pattern with a
number of other senior kernel developers; as they put more time into the review and management
of patches from others, they write fewer patches of their own. (Obscure technical detail: these
numbers do not count “merge commits,” where one set of changes is merged into another. Linus
Torvalds generates large numbers of merge commits; had these been counted he would have
shown up on this list.)
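A rough idea of how such per-developer tallies can be produced is sketched below. It is not the gitdm implementation; it assumes a list of author names, one per non-merge commit, such as the output of `git log --no-merges --format=%an` over a release range. The function names and the sample authors are hypothetical.

```python
from collections import Counter
from typing import Iterable


def count_changes(author_lines: Iterable[str]) -> Counter:
    """Tally changes per author from one author name per non-merge commit."""
    return Counter(line.strip() for line in author_lines if line.strip())


def top_share(counts: Counter, n: int) -> float:
    """Fraction of all changes contributed by the n most active authors."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(c for _, c in counts.most_common(n)) / total


def single_patch_fraction(counts: Counter) -> float:
    """Fraction of authors who contributed exactly one change."""
    if not counts:
        return 0.0
    return sum(1 for c in counts.values() if c == 1) / len(counts)


# Hypothetical sample standing in for real `git log` output.
sample = ["Alice", "Bob", "Alice", "Carol", "Alice", "Bob", "Dave", "Erin"]
counts = count_changes(sample)
print(counts.most_common(2))
print(top_share(counts, 2))
print(single_patch_fraction(counts))
```

Because merge commits are excluded before counting, a developer whose work consists mostly of merging other people's trees will rank low in a tally like this, as the text notes for Linus Torvalds.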
There are a number of developers for whom we were unable to determine a corporate affiliation;
those are grouped under “unknown” in the table below. With few exceptions, all of the people in
this category have contributed ten or fewer changes to the kernel over the past three years, yet
the large number of these developers causes their total contribution to be quite high.
The category “None,” instead, represents developers who are known to be doing this work on their
own, with no financial contribution happening from any company.
The top 10 contributors, including the groups “unknown” and “none,” make up nearly 70% of
the total contributions to the kernel. It is worth noting that, even if one assumes that all of the
“unknown” contributors were working on their own time, over 70% of all kernel development is
demonstrably done by developers who are being paid for their work.
What we see here is that a small number of companies is responsible for a large portion of the
total changes to the kernel. But there is a “long tail” of companies (over 500 of which do not
appear in the above list) which have made significant changes. There may be no other examples
of such a large, common resource being supported by such a large group of independent actors
in such a collaborative way.
The companies at the top of the listing are almost the same, and Red Hat maintains its
commanding lead here. But we see companies like Nokia, AMD, Texas Instruments, and
Samsung working up to higher contribution levels as they increase their investment in Linux kernel
development.
This rise in development of Linux sponsored by embedded/mobile companies and their suppliers
reflects the increasing importance of Linux in those markets.
An interesting (if approximate) view of kernel development can be had by looking at signoff lines,
and, in particular, at signoff lines added by developers who are not the original authors of the
patches in question. These additional signoffs are usually an indication of review by a subsystem
maintainer. Analysis of signoff lines gives a picture of who admits code into the kernel - who the
gatekeepers are. Since 2.6.30, the developers who added the most non-author signoff lines are:
From this table, we see that Linus Torvalds directly merges just over 1% of the total patch stream;
everything else comes in by way of the subsystem maintainers.
The signoff metric is a loose indication of review, so the above numbers need to be regarded
as approximations only. Still, one can clearly see that subsystem maintainers are rather more
concentrated than kernel developers as a whole; over half of the patches going into the kernel
pass through the hands of developers employed by just two companies.
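The non-author signoff metric described above can be approximated with a small parser. The sketch below is an assumption about how one might do it, not the method used for this paper: it treats a signoff as "non-author" when its email address differs from the commit author's, and the function name, sample message, and addresses are all hypothetical.

```python
import re
from typing import List

# Matches trailer lines of the form "Signed-off-by: Name <email>".
SIGNOFF_RE = re.compile(
    r"^[ \t]*Signed-off-by:[ \t]*(.+?)[ \t]*<([^>]+)>[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)


def non_author_signoffs(author_email: str, message: str) -> List[str]:
    """Return names on Signed-off-by lines whose email differs from the author's."""
    author = author_email.strip().lower()
    return [
        name
        for name, email in SIGNOFF_RE.findall(message)
        if email.strip().lower() != author
    ]


# Hypothetical commit message: the author signs off first, then a maintainer.
message = """example: fix a hypothetical bug

Signed-off-by: Dev One <dev@example.com>
Signed-off-by: Maintainer Two <maint@example.org>
"""

print(non_author_signoffs("dev@example.com", message))
```

Comparing by email address is a simplification; developers who use several addresses would need to be mapped to a single identity first, which is the kind of manual cleanup described in the Resources section.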
There are a number of good reasons for companies to support the Linux kernel. As a result, Linux
has a broad base of support which is not dependent on any single company. Even if the largest
contributor were to cease participation tomorrow, the Linux kernel would remain on a solid footing
with a large and active development community.
There are enough companies participating to fund the bulk of the development effort, even if
many companies which could benefit from contributing to Linux have, thus far, chosen not to.
With the current expansion of Linux in the server, desktop and embedded markets, it’s reasonable
to expect this number of contributing companies – and individual developers – will continue to
increase over time. The kernel development community welcomes new developers; individuals
or corporations interested in contributing to the Linux kernel are encouraged to consult “How to
participate in the Linux community” (which can be found at
http://ldn.linuxfoundation.org/book/how-participate-linux-community)
or to contact the authors of this paper or the Linux Foundation
for more information.
Thanks
The authors would like to thank the thousands of individual kernel contributors; without them,
papers like this would not be interesting to anyone.
Resources
Many of the statistics in this article were generated by the “gitdm” tool, written by Jonathan
Corbet. Gitdm is distributable under the GNU GPL; it can be obtained from
git://git.lwn.net/gitdm.git.
The information for this paper was retrieved directly from the Linux kernel releases as found at
the http://kernel.org/ web site and from the git kernel repository. Some of the logs from the git
repository were cleaned up by hand because of email addresses changing over time and minor typos
in authorship information. A spreadsheet was used to compute a number of the statistics. All
of the logs, scripts, and spreadsheet can be found at
http://www.kernel.org/pub/linux/kernel/people/gregkh/kernel_history/