CPython Internals Sample Chapters
CPython Internals: Your Guide to the Python 3 Interpreter
Anthony Shaw
For online information and ordering of this and other books by Real
Python, please visit realpython.com. For more information, please
contact us at [email protected].
Thank you for downloading this ebook. This ebook is licensed for
your personal enjoyment only. This ebook may not be re-sold or
given away to other people. If you would like to share this book
with another person, please purchase an additional copy for each
recipient. If you’re reading this book and did not purchase it,
or it was not purchased for your use only, then please return to
realpython.com/cpython-internals and purchase your own copy.
Thank you for respecting the hard work behind this book.
This is a sample from “CPython Internals: Your
Guide to the Python 3 Interpreter”
With this book you’ll cover the critical concepts behind the internals of
CPython and how they work with visual explanations as you go along.
“It’s the book that I wish existed years ago when I started my Python
journey. After reading this book your skills will grow and you will be
able to solve even more complex problems that can improve our world.”
Of course, after going over that chapter I couldn’t resist the rest. I am
eagerly looking forward to have my own printed copy once it’s out!
I had gone through your ‘Guide to the CPython Source Code’ article
previously, which got me interested in finding out more about the in-
ternals.
There are a ton of books on Python which teach the language, but I
haven’t really come across anything that would go about explaining
the internals to those curious minded.
Anthony has been programming since the age of 12 and found a love
for Python while trapped inside a hotel in Seattle, Washington, 15
years later. After ditching the other languages he’d learned, Anthony
has been researching, writing about, and creating courses for Python
ever since.
Anthony also contributes to small and large Open Source projects, in-
cluding CPython, as well as being a member of the Apache Software
Foundation.
Contents
Foreword
Introduction
  How to Use This Book
  Bonus Material and Learning Resources
Compiling CPython
  Compiling CPython on macOS
  Compiling CPython on Linux
  Installing a Custom Version
  A Quick Primer on Make
  CPython’s Make Targets
  Compiling CPython on Windows
  Profile-Guided Optimization
  Conclusion
Debugging
  Using the Crash Handler
  Compiling Debug Support
  Using LLDB for macOS
  Using GDB
  Using Visual Studio Debugger
  Using CLion Debugger
  Conclusion
Foreword
CPython Internals will take you on a journey to explore the wildly suc-
cessful programming language Python. The book serves as a guide
to how CPython works under the hood. It will give you a glimpse of
how the core developers crafted the language.
Why do I want to share Anthony’s CPython Internals with you? It’s the
book that I wish existed years ago when I started my Python journey.
More importantly, I believe we, as members of the Python community,
have a unique opportunity to put our expertise to work to help solve
the complex real-world problems facing us.
I’m confident that after reading this book, your skills will grow, and
you will be able to solve even more complex problems and improve our
world.
It’s my hope that Anthony motivates you to learn more about Python,
inspires you to build innovative things, and gives you confidence to
share your creations with the world.
Warmly,
Introduction
Are there certain parts of Python that just seem like magic, like how
finding an item is so much faster with dictionaries than looping over a
list? How does a generator remember the state of variables each time
it yields a value? Why don’t you ever have to allocate memory like you
do with other languages?
The answer is that CPython, the most popular Python runtime, is writ-
ten in human-readable C and Python code.
CPython gives the developer writing Python code the platform to write
scalable and performant applications. At some stage in your progres-
sion as a Python developer, you’ll need to understand how CPython
works. These abstractions aren’t perfect, and they’re leaky.
Once you understand how CPython works, you can fully leverage its
power and optimize your applications. This book will explain the con-
cepts, ideas, and technicalities of CPython.
In this book, you’ll cover the major concepts behind the internals of
CPython and learn how to:
• Make changes to the Python syntax and compile them into your
version of CPython
• Navigate and comprehend the inner workings of features like lists,
dictionaries, and generators
• Master CPython’s memory management capabilities
• Scale your Python code with parallelism and concurrency
• Modify the core types with new functionality
• Run the test suite
• Profile and benchmark the performance of your Python code and
runtime
• Debug C and Python code like a professional
• Modify or upgrade components of the CPython library to contribute
them to future versions
How to Use This Book
Take your time with each chapter and try out the demos and interac-
tive elements. You’ll feel a sense of achievement as you grasp the core
concepts that will make you a better Python programmer.
For the best results, we recommend that you avoid copying and past-
ing the code examples. The examples in this book took many itera-
tions to get right, and they may also contain bugs.
With enough practice, you’ll master this material—and have fun along
the way!
In fact, while writing this book, we discovered many lines of code that
were written by Guido van Rossum (the author of Python) and left
untouched since version 1.
Some of the concepts in this book are brand-new. Some are even ex-
perimental. While writing this book, we came across issues in the
source code and bugs in CPython that were later fixed or improved.
That’s part of the wonder of CPython as a flourishing open source
project.
Bonus Material and Learning Resources
The skills you’ll learn in this book will help you read and understand
current and future versions of CPython. Change is constant, and ex-
pertise is something you can develop along the way.
Code Samples
The examples and sample configurations throughout this book will
be marked with a header denoting them as part of the cpython-book-
samples folder:
cpython-book-samples/01/example.py
import this
Code Licenses
The example Python scripts associated with this book are licensed un-
der a Creative Commons Public Domain (CC0) License. This means
you’re welcome to use any portion of the code for any purpose in your
own programs.
Note
The code in this book has been tested with Python 3.9 on Win-
dows 10, macOS 10.15, and Linux.
Formatting Conventions
Code blocks are used to present example code:
Note
This is a note filled in with placeholder text. The quick brown
fox jumps over the lazy dog. The quick brown Python slithers
over the lazy hog.
Important
Any references to a file within the CPython source code will be shown
like this:
path/to/file.py
Keyboard commands and shortcuts will be given for both macOS and
Windows:
Ctrl + Space
realpython.com/cpython-internals/feedback
• realpython.com
• @realpython on Twitter
• The Real Python Newsletter
• The Real Python Podcast
Getting the CPython Source Code
Think about the features you expect from the Python distribution:
What’s in the Source Code?
These are all part of the CPython distribution. It includes a lot more
than just a compiler.
In this book, you’ll explore the different parts of the CPython distribu-
tion:
Note
This book targets version 3.9 of the CPython source code.
Important
Note
If you don’t have Git available, then you can install it from
git-scm.com. Alternatively, you can download a ZIP file of the
CPython source directly from the GitHub website.
Inside the newly downloaded cpython directory, you’ll find the follow-
ing subdirectories:
cpython/
Setting Up Your Development Environment
Throughout this book, you’ll be working with both C and Python code.
It’s essential that you have your development environment configured
to support both languages.
The CPython source code is about 65 percent Python (of which the
tests are a significant part) and 24 percent C. The remainder is a mix
of other languages.
IDE or Editor?
If you haven’t yet decided which development environment to use,
then there’s one decision to make first: whether to use an integrated
development environment (IDE) or a code editor.
IDEs also take longer to start up. If you want to edit a file quickly, then
a code editor is a better choice.
There are hundreds of editors and IDEs available for free or at a cost.
Here are some commonly used IDEs and editors suitable for CPython
development:
In the sections below, you’ll explore the setup steps for the following
editors and IDEs:
Skip ahead to the section for your chosen application, or read all of
them if you want to compare.
Setting Up Visual Studio
Note
None of the paid features of Visual Studio are required for com-
piling CPython or completing this book. You can use the free
Community edition.
Visual Studio is available for free from Microsoft’s Visual Studio web-
site.
You can deselect Python 3 64-bit (3.7.2) if you already have Python
3.7 installed. You can also deselect any other optional features if you
want to conserve disk space.
The installer will then download and install all the required compo-
nents. The installation can take up to an hour, so you may want to
read on and come back to this section when it finishes.
Visual Studio will then download a copy of CPython from GitHub us-
ing the version of Git bundled with Visual Studio. This step also saves
you the hassle of having to install Git on Windows. The download may
take up to ten minutes.
Important
Once the project has downloaded, you need to point Visual Studio to
the PCbuild/pcbuild.sln solution file by clicking Solutions and Projects >
pcbuild.sln:
Now that you have Visual Studio configured and the source code
downloaded, you can compile CPython on Windows by following the
steps in the next chapter.
Setting Up Visual Studio Code
Installing
Visual Studio Code, sometimes known as VS Code, is available with a
simple installer at code.visualstudio.com.
Out of the box, VS Code has the necessary code editing capabilities,
but it becomes more powerful once you install extensions.
Inside the Extensions panel, you can search for extensions by name
or by their unique identifier, such as ms-vscode.cpptools. In some cases
there are many plugins with similar names, so use the unique identi-
fier to be sure you’re installing the right one.
After you install these extensions, you’ll need to reload the editor.
Many of the tasks in this book require a command line. You can add an
integrated terminal into VS Code by selecting Terminal > New Terminal.
A terminal will appear below the code editor:
If you click on or hover over a C macro, then the editor will expand
that macro to the compiled code:
Create a tasks.json file inside the .vscode directory if one doesn’t al-
ready exist. This tasks.json file will get you started:
cpython-book-samples/11/tasks.json
{
"version": "2.0.0",
"tasks": [
{
"label": "build",
"type": "shell",
"group": {
"kind": "build",
"isDefault": true
},
"windows": {
"command": "PCBuild/build.bat",
"args": ["-p", "x64", "-c", "Debug"]
},
"linux": {
"command": "make -j2 -s"
},
"osx": {
"command": "make -j2 -s"
}
}
]
}
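To see how the per-platform keys in this file are meant to be read, here is a minimal Python sketch that picks the command for the current operating system. The helper name build_command and the mapping from sys.platform to the task keys are illustrative assumptions, not part of VS Code or CPython.

```python
import json
import sys

# A trimmed copy of the task definition shown above.
TASKS_JSON = """
{
  "version": "2.0.0",
  "tasks": [
    {
      "label": "build",
      "type": "shell",
      "windows": {"command": "PCBuild/build.bat",
                  "args": ["-p", "x64", "-c", "Debug"]},
      "linux": {"command": "make -j2 -s"},
      "osx": {"command": "make -j2 -s"}
    }
  ]
}
"""


def build_command(task, platform=sys.platform):
    """Return the command line for the given platform (hypothetical helper)."""
    if platform.startswith("win"):
        key = "windows"
    elif platform == "darwin":
        key = "osx"
    else:
        key = "linux"  # assumption: treat every other platform like Linux
    entry = task[key]
    return " ".join([entry["command"], *entry.get("args", [])])


build_task = json.loads(TASKS_JSON)["tasks"][0]
print(build_command(build_task))
```

On Windows the sketch joins the command with its args list, while on macOS and Linux the command string already carries the make flags.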
Using the Task Explorer plugin, you’ll see a list of your configured
tasks inside the vscode group:
In the next chapter, you’ll learn more about the build process for compiling
CPython.
Setting Up JetBrains CLion
CPython has both C and Python code. You can’t install C/C++ support
into PyCharm, but CLion comes bundled with Python support.
Important
After compiling CPython for the first time, you’ll have a makefile in
the root of the source directory.
Open CLion and choose Open or Import from the welcome screen.
Navigate to the source directory, select the makefile, and press Open :
CLion will ask whether you want to open the directory or import
the makefile as a new project. Select Open as Project to import as a
project.
CLion will ask which make target to run before importing. Leave the
default option, clean, and continue:
Next, check that you can build the CPython executable from CLion.
From the top menu, select Build > Build Project.
In the status bar, you should see a progress indicator for the project
build:
Once this task is complete, you can target the compiled binary as a
run/debug configuration.
Click OK to add this configuration. You can repeat this step as many
times as you like for any of the CPython make targets. See the section
The cpython build configuration will now be available in the top right
of the CLion window:
To test it out, click the arrow icon or select Run > Run 'cpython' from
the top menu. You should now see the REPL at the bottom of the
CLion window:
Great! Now you can make changes and quickly try them out by click-
ing Build and Run . If you put any breakpoints in the C code, then
make sure you choose Debug instead of Run .
Within the code editor, the shortcuts Cmd + Click on macOS and Ctrl
+ Click on Windows and Linux will bring up in-editor navigation fea-
tures:
Setting up Vim
Vim is a powerful console-based text editor. For fast development,
use Vim with your hands resting on the keyboard home keys. The
shortcuts and commands are within reach.
Note
On most Linux distributions and within the macOS Terminal,
vi is an alias for vim. We’ll use the vim command in this book,
but if you have the alias, then vi will also work.
Out of the box, Vim has only basic functionality, little more than a text
editor like Notepad. With some configuration and extensions, how-
ever, Vim can become a powerful tool for both Python and C editing.
1. Fugitive: A status bar for Git with shortcuts for many Git tasks
2. Tagbar: A pane for making it easier to jump to functions, meth-
ods, and classes
To install these plugins, first change the contents of your Vim configuration
file (normally HOME/.vimrc) to include the following lines:
cpython-book-samples/11/.vimrc
syntax on
set nocompatible " be iMproved, required
filetype off " required
You should see output for the download and installation of the plugins
specified in the configuration file.
When editing or exploring the CPython source code, you will want to
jump quickly between methods, functions, and macros. A basic text
search won’t distinguish a call to a function or its definition from the
implementation. But you can use an application called ctags to index
source files across a multitude of languages into a plain text database.
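As a rough illustration of what that plain text database holds, the sketch below parses a single line in the tab-separated ctags format (tag name, file, then an address used to jump to the definition). The Tag type, the parse_tag_line helper, and the sample entry are made up for this example.

```python
from typing import NamedTuple, Optional


class Tag(NamedTuple):
    name: str      # identifier, such as a function or macro name
    path: str      # file that defines it
    address: str   # search pattern or line number used to jump there


def parse_tag_line(line: str) -> Optional[Tag]:
    """Parse one line of a ctags database; return None for metadata lines."""
    if line.startswith("!_TAG_"):
        return None  # header lines describe the tags file itself
    name, path, rest = line.rstrip("\n").split("\t", 2)
    # Extended formats append ';"' plus extra fields after the address.
    address = rest.split(';"', 1)[0]
    return Tag(name, path, address)


# Made-up entry for illustration; real output depends on your ctags version.
sample = 'my_function\tLib/example.py\t/^def my_function():$/;"\tf\n'
print(parse_tag_line(sample))
```

Because each entry records where a name is defined, an editor can jump straight to the definition instead of stopping at every call site.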
To index CPython’s headers for all the C files and Python files in the
standard library, run the following code:
$ ./configure
$ make tags
$ vim Python/ceval.c
You’ll see the Git status at the bottom and the functions, macros, and
variables in the right-hand pane:
$ vim Lib/subprocess.py
Within Vim, you can switch between windows with Ctrl + W , move
to the right-hand pane with L , and use the arrow keys to move up
and down between the tagged functions.
See Also
Check out VIM Adventures for a fun way to learn and memorize
the Vim commands.
Conclusion
If you’re still undecided about which environment to use, then you
don’t need to make a decision right away. We used multiple environ-
ments while writing this book and working on changes to CPython.
Compiling CPython
In the previous chapter, you saw how to set up your development en-
vironment with an option to run the build stage, which recompiles
CPython. Before the build steps will work, you need a C compiler and
some build tools.
The tools used depend on the operating system you’re using, so skip
ahead to the section for your operating system.
Note
If you’re concerned that any of these steps will interfere with
your existing CPython installations, don’t worry. The CPython
source directory behaves like a virtual environment.
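If you want to confirm which interpreter is actually running, a small check like the following can help. It's only a sketch: describe_interpreter is a hypothetical helper, and the values it prints depend entirely on your machine.

```python
import sys


def describe_interpreter():
    """Report which Python is running this code and where it looks for files."""
    return {
        "executable": sys.executable,    # path of the running binary
        "prefix": sys.prefix,            # root used to locate the standard library
        "base_prefix": sys.base_prefix,  # differs from prefix inside a virtual environment
        "version": sys.version.split()[0],
    }


for key, value in describe_interpreter().items():
    print(f"{key:12} {value}")
```

Run it once with your system Python and once with the binary you build from source, then compare the executable paths.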
Compiling CPython on macOS
Note
Within the terminal, install the C compiler and tool kit by running the
following:
$ xcode-select --install
You’ll also need a working copy of OpenSSL to use for fetching pack-
ages from the PyPI website. If you plan on using this build to install
additional packages, then SSL validation is required.
Note
If you don’t have Homebrew, then you can download and install
it directly from GitHub with the following command:
Once you have Homebrew installed, you can install the dependencies
for CPython with the brew install command:
Now that you have the dependencies, you can run the configure script.
The Homebrew command brew --prefix <package> prints the directory
where <package> is installed. You’ll enable SSL support by passing
configure the location that Homebrew uses for OpenSSL.
The flag --with-pydebug enables debug hooks. Add this flag if you intend
to debug CPython for development or testing purposes. Debugging
CPython is covered extensively in the “Debugging” chapter.
The configuration stage needs to be run only once, with the location
of the zlib package specified:
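As a rough sketch of how such an invocation can be assembled, the Python helper below builds a configure command line from two install prefixes. The configure_command helper and the example prefixes are assumptions for illustration, as is the choice to pass zlib through CPPFLAGS and LDFLAGS and OpenSSL through --with-openssl. Check ./configure --help in your checkout before relying on any of it.

```python
import shlex


def configure_command(openssl_prefix, zlib_prefix, debug=True):
    """Assemble a ./configure invocation as a single shell string."""
    parts = [
        # Quote only the values so the shell still sees VAR=value assignments.
        "CPPFLAGS=" + shlex.quote(f"-I{zlib_prefix}/include"),  # zlib headers
        "LDFLAGS=" + shlex.quote(f"-L{zlib_prefix}/lib"),       # zlib library
        "./configure",
        "--with-openssl=" + shlex.quote(openssl_prefix),
    ]
    if debug:
        parts.append("--with-pydebug")
    return " ".join(parts)


# Placeholder prefixes; use the output of `brew --prefix <package>` instead.
print(configure_command("/usr/local/opt/openssl", "/usr/local/opt/zlib"))
```

Keeping the prefixes as parameters means the same helper works wherever Homebrew happens to install its packages.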
You can now build the CPython binary by running the following com-
mand:
$ make -j2 -s
See Also
For more information on the options for make, see the section “A
Quick Primer on Make.”
During the build, you may receive some errors. In the build summary,
make will notify you that not all packages were built. For example, the
ossaudiodev, spwd, and _tkinter packages will fail to build with this set of
The build will take a few minutes and generate a binary called
python.exe. Every time you make changes to the source code, you’ll
need to rerun make with the same flags.
$ ./python.exe
Python 3.9 (tags/v3.9:9cf67522, Oct 5 2020, 10:00:00)
[Clang 10.0.1 (clang-1001.0.46.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>
Important
Yes, that’s right, the macOS build has a .exe file extension. This
extension is not because it’s a Windows binary!
If you later run make install or make altinstall, then the file will
be renamed python before it’s installed onto your system.
Compiling CPython on Linux
Use this command for Fedora Core, RHEL, CentOS, or other YUM-
based systems:
Now that you have the dependencies, you can run the configure script,
optionally enabling the debug hooks using --with-pydebug:
$ ./configure --with-pydebug
Next, you can build the CPython binary by running the generated
makefile:
$ make -j2 -s
See Also
For more help on the options for make, see the section “A Quick
Primer on Make.”
Review the output to ensure that there were no issues compiling the
_ssl module. If there were, then check with your distribution for instructions
on installing the headers for OpenSSL.
During the build, you may receive some errors. In the build summary,
make will notify you that not all packages were built. That’s okay if you
The build will take a few minutes and generate a binary called python.
This is the debug binary of CPython. Execute ./python to see a working
REPL:
$ ./python
Python 3.9 (tags/v3.9:9cf67522, Oct 5 2020, 10:00:00)
[Clang 10.0.1 (clang-1001.0.46.4)] on Linux
Type "help", "copyright", "credits" or "license" for more information.
>>>
Installing a Custom Version
For macOS and Linux, use the altinstall command, which doesn’t
create symbolic links for python3 and installs a standalone version:
$ make altinstall
For Windows, you have to change the build configuration from De-
bug to Release, then copy the packaged binaries to a directory on your
computer that is part of the system path.
A Quick Primer on Make
For C, C++, and other compiled languages, the list of commands you
need to execute to load, link, and compile your code in the right order
can be very long. When compiling applications from source, you need
to link any external libraries in the system.
When you executed ./configure, autoconf searched your system for the
libraries that CPython requires and copied their paths into a makefile.
Take the docclean target as an example. This target deletes some gen-
erated documentation files using the rm command:
docclean:
-rm -rf Doc/build
-rm -rf Doc/tools/sphinx Doc/tools/pygments Doc/tools/docutils
If you call make without specifying a target, then make will run the de-
fault target, which is the first target specified in the makefile. For
CPython, this is the all target, which compiles all parts of CPython.
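To make the "first target wins" rule concrete, here is a simplified Python sketch that scans makefile text for the first ordinary target. It ignores variable assignments and names that start with a dot, and it doesn't handle pattern rules or includes, so treat it as an approximation of what make does. The default_target helper and the toy makefile are invented for this example.

```python
import re

# A target line looks like "name: prerequisites"; recipe lines start with a tab.
# The lookahead skips ":=" variable assignments.
TARGET_RE = re.compile(r"^([A-Za-z0-9_][A-Za-z0-9_.\-/]*)\s*:(?!=)")


def default_target(makefile_text):
    """Return the first ordinary target in the text, or None if there is none."""
    for line in makefile_text.splitlines():
        match = TARGET_RE.match(line)
        if match:
            return match.group(1)
    return None


# Toy makefile for illustration only; it is not CPython's real makefile.
EXAMPLE = """\
CC = gcc

all: build

build:
\t$(CC) -o demo demo.c

docclean:
\t-rm -rf Doc/build
"""

print(default_target(EXAMPLE))  # all
```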
make has many options. Here are some you’ll find useful throughout
this book:
Option Use
In the next section and throughout the book, you’ll run make with these
options:
The -j2 flag allows make to run two jobs simultaneously. If you have
four or more cores, then you can change this to four or higher and the
compilation will complete faster.
The -s flag stops the makefile from printing every command it runs to
the console. If you want to see what’s happening, then remove the -s
flag.
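If you'd rather match the job count to your machine than hard-code it, a sketch like this one can build the make arguments for you. The make_command helper is hypothetical, and using one job per reported core is just a starting point.

```python
import os


def make_command(jobs=None, silent=True):
    """Build the argument list for make with a job count and optional -s."""
    if jobs is None:
        jobs = os.cpu_count() or 2  # cpu_count() can return None
    args = ["make", f"-j{jobs}"]
    if silent:
        args.append("-s")
    return args


print(" ".join(make_command(jobs=2)))  # make -j2 -s
print(" ".join(make_command()))        # job count depends on your machine
```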
CPython’s Make Targets
Build Targets
The following targets are used for building the CPython binary:
Target Purpose
optimization
profile-opt
Test Targets
The following targets are used for testing your compiled binary:
Target Purpose
Cleaning Targets
The primary cleaning targets are clean, clobber, and distclean. The
clean target is for generally removing compiled and cached libraries
and .pyc files.
If you find that clean doesn’t do the job, then try clobber. The clob-
ber target will remove your makefile, so you’ll have to run ./configure
again.
The following list includes the three primary targets listed above, as
well as some additional cleaning targets:
Target Purpose
jobs
cleantest
Installation Targets
There are two flavors of installation targets: the default version, such
as install, and the alt version, such as altinstall. If you want to in-
stall the compiled version onto your computer but don’t want it to
become the default Python 3 installation, then use the alt version of
the commands:
Target Purpose
After you install with make install, the command python3 will link
to your compiled binary. If you use make altinstall, however, only
python$(VERSION) will be installed, and the existing link for python3 will
remain intact.
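The difference between the two flavors can be summarized in a few lines of Python. This sketch only models the command names mentioned above. The real targets install many more files, and installed_commands is a hypothetical helper.

```python
import sysconfig


def installed_commands(target, version):
    """List the command names described above for each install flavor."""
    names = [f"python{version}"]     # python$(VERSION), installed by both flavors
    if target == "install":
        names.append("python3")      # the default flavor also links python3
    elif target != "altinstall":
        raise ValueError(f"unknown target: {target}")
    return names


version = sysconfig.get_python_version()  # MAJOR.MINOR of the running interpreter
print(installed_commands("altinstall", version))
print(installed_commands("install", version))
```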
Miscellaneous Targets
Below are some additional make targets that you may find useful:
Target Purpose
PEP 7)
smelly
Compiling CPython on Windows
1. Compile from the command prompt. This still requires the Mi-
crosoft Visual C++ compiler, which comes with Visual Studio.
2. Open PCbuild/pcbuild.sln from Visual Studio and build directly.
Inside the PCbuild folder is a .bat file that automates this process for
you. Open a command prompt window inside PCbuild and execute
PCbuild/get_externals.bat:
> get_externals.bat
Using py -3.7 (found 3.7 with py.exe)
Fetching external libraries...
Fetching bzip2-1.0.6...
Fetching sqlite-3.28.0.0...
Fetching xz-5.2.2...
Fetching zlib-1.2.11...
Now you can compile from either the command prompt or Visual Stu-
dio.
If you do any debugging, then the debug build comes with the ability
to attach breakpoints in the source code. To enable the debug build,
you add -c Debug to specify the debug configuration.
> amd64\python_d.exe
Note
The suffix _d specifies that CPython was built in the debug con-
figuration.
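You can also ask the interpreter itself whether it's a debug build. The check below relies on sys.gettotalrefcount(), which CPython only provides in debug builds. The is_debug_build wrapper is just a convenience name for this example.

```python
import sys


def is_debug_build():
    """Return True when running on a CPython debug build."""
    # sys.gettotalrefcount() only exists in debug builds of CPython.
    return hasattr(sys, "gettotalrefcount")


print("debug build" if is_debug_build() else "release build")
```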
Arguments
Flags
Here are some optional flags you can use for build.bat:
Flag Purpose
the language)
--regen
When the solution file is loaded, it will prompt you to retarget the
projects inside the solution to the version of the C/C++ compiler that
you have installed. Visual Studio will also target the release of the
Windows SDK that you have installed.
The build stage could take ten minutes or more the first time. Once
the build completes, you may see a few warnings that you can ignore.
You can run the release build by changing the build configuration
from Debug to Release on the top menu bar and rerunning Build >
Build Solution. You now have both debug and release versions of the
CPython binary within PCbuild/amd64.
You’ll most likely want to use the debug binary as it comes with debug-
ging support in Visual Studio and will be useful as you read through
this book.
In the Add Environment window, target the python_d.exe file as the interpreter
inside PCbuild/amd64 and the pythonw_d.exe as the windowed
interpreter:
Throughout this book, there will be REPL sessions with example com-
mands. I encourage you to use the debug binary to run these REPL
sessions in case you want to put in any breakpoints within the code.
To make it easier to navigate the code, in the Solution view, click the
toggle button next to the Home icon to switch to Folder view:
Profile-Guided Optimization
For CPython, the profiling stage runs python -m test --pgo, which executes
the regression tests specified in Lib/test/libregrtest/pgo.py.
These tests have been specifically selected because they use a com-
monly used C extension module or type.
Note
The PGO process is time-consuming, so to keep your compila-
tion time short, I’ve excluded it from the lists of recommended
steps offered throughout this book.
Conclusion
In this chapter, you’ve seen how to compile CPython source code into
a working interpreter. You’ll use this knowledge throughout the book
as you explore and adapt the source code.
You might need to repeat the compilation steps dozens or even hun-
dreds of times when working with CPython. If you can adapt your
development environment to create shortcuts for recompilation, then
it’s better to do that now and save yourself a lot of time.
The Python Language and Grammar
Some compilers will compile into a low-level machine code that can
be executed directly on a system. Other compilers will compile into
an intermediary language to be executed by a virtual machine.
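CPython belongs to the second group. As a quick way to see the intermediary instructions for yourself, the sketch below uses the standard library's dis module on a tiny function. The add function is just an example, and the exact instruction names vary between Python versions.

```python
import dis


def add(a, b):
    return a + b


# Each instruction is one step for CPython's virtual machine, not for the CPU.
for instruction in dis.get_instructions(add):
    print(instruction.opname, instruction.argrepr)
```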
Python code isn’t compiled into machine code. It’s compiled into a
low-level intermediary language called bytecode. This bytecode is
stored in .pyc files and cached for execution. If you run the same
Why CPython Is Written in C and Not Python
The answer is based on how compilers work. There are two types of
compilers:
There are also tools available that can take a language specification
and create a parser, which you’ll learn about later in this chapter. Pop-
ular compiler-compilers include GNU Bison, Yacc, and ANTLR.
See Also
If you want to learn more about parsers, then check out the Lark
project. Lark is a parser for context-free grammar written in
Python.
CPython, on the other hand, kept its C heritage. Many of the standard
library modules, like the ssl module or the socket module, are written
in C to access low-level operating system APIs.
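To check how a given module is implemented on your build, you can inspect its import spec. The sketch below is a rough classifier: module_kind is a hypothetical helper, and the result for names like _ssl and _socket (the C counterparts of ssl and socket) depends on how your interpreter was compiled.

```python
import importlib.machinery
import importlib.util


def module_kind(name):
    """Classify a top-level module as built-in, frozen, extension, or python."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return "missing"
    origin = spec.origin or ""
    if origin in ("built-in", "frozen"):
        return origin            # compiled into the interpreter binary itself
    if origin.endswith(tuple(importlib.machinery.EXTENSION_SUFFIXES)):
        return "extension"       # a separately compiled C module
    return "python"              # ordinary .py source


for name in ["sys", "_ssl", "_socket", "json"]:
    print(f"{name:8} {module_kind(name)}")
```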
The APIs in the Windows and Linux kernels for creating network sock-
ets, working with the file system, or interacting with the display were
all written in C, so it made sense for Python’s extensibility layer to be
focused on the C language. Later in this book, you’ll cover the Python
standard library and the C modules.
The compiler needs strict rules for the grammatical structure for the
language before it tries to execute it.
The Python Language Specification
Note
For the rest of this book, ./python will refer to the compiled ver-
sion of CPython. However, the actual command will depend on
your operating system.
For Windows:
> python.exe
For Linux:
$ ./python
For macOS:
$ ./python.exe
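If you script your experiments, the same choice can be made in code. This is a small sketch of the mapping in the note above. built_interpreter is a hypothetical helper, and the fallback for other Unix-like systems is an assumption.

```python
import sys


def built_interpreter(platform=sys.platform):
    """Return the command for the freshly built binary on each platform."""
    if platform.startswith("win"):
        return "python.exe"
    if platform == "darwin":
        return "./python.exe"
    return "./python"  # Linux and, by assumption, other Unix-like systems


print(built_interpreter())
```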
Language Documentation
The Doc/reference directory contains reStructuredText explanations
of the features in the Python language. These files form the official
Python reference guide at docs.python.org/3/reference.
Inside the directory are the files you need to understand the whole
language, structure, and keywords:
cpython/Doc/reference
compound_stmts.rst Compound statements like if, while, for, and function definitions
datamodel.rst Objects, values, and types
executionmodel.rst The structure of Python programs
expressions.rst The elements of Python expressions
grammar.rst Python’s core grammar (referencing Grammar/Grammar)
import.rst The import system
index.rst Index for the language reference
introduction.rst Introduction to the reference documentation
lexical_analysis.rst Lexical structure like lines, indentation, tokens, and keywords
simple_stmts.rst Simple statements like assert, import, return, and yield
toplevel_components.rst Description of the ways to execute Python, like scripts and modules
An Example
The with statement has many forms, the simplest being the instantia-
tion of a context manager and a nested block of code:
with x():
...
with x() as y:
...
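Here's a runnable version of both forms, using contextlib.contextmanager from the standard library to define x. The events list and the "resource" value are only there to make the order of operations visible.

```python
from contextlib import contextmanager

events = []


@contextmanager
def x():
    events.append("enter")
    try:
        yield "resource"   # this value is bound by the "as" form
    finally:
        events.append("exit")


with x():           # first form: the value is discarded
    events.append("body 1")

with x() as y:      # second form: the value is assigned to y
    events.append(f"body 2 got {y}")

print(events)
```

In both forms the context manager is entered before the nested block runs and exited afterward. Only the second form keeps a reference to the yielded value.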
• * for repetition
• + for at-least-once repetition
• [] for optional parts
• | for alternatives
• () for grouping
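If you've used regular expressions, these symbols will look familiar. As a loose analogy only (a grammar can describe nested structure that a regular expression can't), the sketch below writes a made-up rule in this notation and mirrors it with Python's re module. The order rule and the is_order helper are invented for this example.

```python
import re

# Hypothetical rule in the notation above:
#   order: 'tea' ('hot' | 'iced') ['large'] 'sugar'* 'please'+
# The same shape as a regular expression over space-separated words.
ORDER = re.compile(
    r"tea"
    r" (hot|iced)"      # () groups, | separates alternatives
    r"( large)?"        # [] optional part (spelled ? in a regex)
    r"( sugar)*"        # * zero or more repetitions
    r"( please)+"       # + at least one repetition
)


def is_order(text):
    """Return True when the whole text matches the made-up order rule."""
    return ORDER.fullmatch(text) is not None


print(is_order("tea hot please"))                            # True
print(is_order("tea iced large sugar sugar please please"))  # True
print(is_order("tea hot"))                                   # False
```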
See Also
In CPython 3.9, the CPython source code has two grammar files.
One legacy grammar is written in a context-free notation called
Backus-Naur Form (BNF). In CPython 3.10, the BNF grammar
file (Grammar/Grammar) has been removed.
There are a few forms of the while statement. The simplest contains
an expression, then the : terminal followed by a block of code:
If you search for while_stmt in the grammar file, then you can see the
definition:
while_stmt[stmt_ty]:
| 'while' a=named_expression ':' b=block c=[else_block] ...
try_stmt[stmt_ty]:
| 'try' ':' b=block f=finally_block { _Py_Try(b, NULL, NULL, f, EXTRA) }
| 'try' ':' b=block ex=except_block+ el=[else_block] f=[finally_block]..
except_block[excepthandler_ty]:
| 'except' e=expression t=['as' z=target { z }] ':' b=block {
_Py_ExceptHandler(e, (t) ? ((expr_ty) t)->v.Name.id : NULL, b, ...
| 'except' ':' b=block { _Py_ExceptHandler(NULL, NULL, b, EXTRA) }
finally_block[asdl_seq*]: 'finally' ':' a=block { a }
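You can see the result of these rules from Python with the ast module. In this sketch, the names risky, cleanup, keep_going, step, and done are placeholders that are parsed but never called.

```python
import ast

SOURCE = """
try:
    risky()
finally:
    cleanup()

while keep_going():
    step()
else:
    done()
"""

try_node, while_node = ast.parse(SOURCE).body

# 'try' ':' block finally_block -> no handlers, no else, only a finally body
print(type(try_node).__name__, len(try_node.handlers), len(try_node.finalbody))

# 'while' named_expression ':' block [else_block] -> the else part fills orelse
print(type(while_node).__name__, len(while_node.orelse))
```

The first alternative of try_stmt passes NULL for the handlers and the else block, which is why both of those lists come back empty for a try statement that has only a finally block.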
The Parser Generator
The CPython parser was rewritten in Python 3.9 from a parser table
automaton (the pgen module) into a contextual grammar parser.
In Python 3.9, the old parser is available at the command line by using
the -X oldparser flag, and in Python 3.10 it’s removed completely. This
book refers to the new parser implemented in 3.9.
Regenerating Grammar
To see pegen, the new PEG generator introduced in CPython 3.9, in
action, you can change part of the Python grammar. Search
Grammar/python.gram for small_stmt to see the definition of small statements:
small_stmt[stmt_ty] (memo):
| assignment
| e=star_expressions { _Py_Expr(e, EXTRA) }
| &'return' return_stmt
| &('import' | 'from') import_stmt
| &'raise' raise_stmt
| 'pass' { _Py_Pass(EXTRA) }
| &'del' del_stmt
| &'yield' yield_stmt
| &'assert' assert_stmt
| 'break' { _Py_Break(EXTRA) }
| 'continue' { _Py_Continue(EXTRA) }
| &'global' global_stmt
| &'nonlocal' nonlocal_stmt
Change the 'pass' line so that it accepts either 'pass' or 'proceed' as the
keyword:
| ('pass'|'proceed') { _Py_Pass(EXTRA) }
Next, rebuild the grammar files. CPython comes with scripts to auto-
mate grammar regeneration.
$ make regen-pegen
If the code compiled successfully, then you can execute your new
CPython binary and start a REPL.
In the REPL, you can now try defining a function. Instead of using the
pass statement, use the proceed keyword alternative that you compiled
into the Python grammar:
$ ./python
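One way to check from Python whether your rebuilt interpreter picked up the change is to look at the AST it produces. This is a sketch under the assumption that the modified grammar maps proceed to the same node as pass. proceed_statement_kind is a hypothetical helper.

```python
import ast

SOURCE = "def example():\n    proceed\n"


def proceed_statement_kind(source=SOURCE):
    """Return the AST node type of the function's only statement."""
    function = ast.parse(source).body[0]
    return type(function.body[0]).__name__


# A build with the modified grammar should report "Pass". An unmodified
# interpreter reads proceed as an ordinary name and reports "Expr".
print(proceed_statement_kind())
```

Note that an unmodified interpreter doesn't reject this source. It parses proceed as a bare name, which would only fail with a NameError when the function is called.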
Tokens
Alongside the grammar file in the Grammar folder is the Grammar/Tokens
file, which contains each of the unique types found as leaf nodes in a
parse tree. Each token also has a name and a generated unique ID.
The names make it simpler to refer to tokens in the tokenizer.
Note
The Grammar/Tokens file is a new feature in Python 3.8.
For example, the left parenthesis is called LPAR, and semicolons are
called SEMI. You’ll see these tokens later in the book:
LPAR '('
RPAR ')'
LSQB '['
RSQB ']'
COLON ':'
COMMA ','
SEMI ';'
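The same names are exposed to Python code through the token module in the standard library. This short sketch looks up each of the strings above. The numeric IDs it prints are generated and can differ between Python versions.

```python
import token

for text in ["(", ")", "[", "]", ":", ",", ";"]:
    token_id = token.EXACT_TOKEN_TYPES[text]   # string -> numeric ID
    print(f"{token.tok_name[token_id]:6} {text!r} id={token_id}")
```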
As with the grammar file, if you change the Grammar/Tokens file, you need
to rerun pegen.
To see tokens in action, you can use the tokenize module in CPython.
Note
The tokenizer written in Python is a utility module. The actual
Python parser uses a different process for identifying tokens.
cpython-book-samples/13/test_tokens.py
# Demo application
def my_function():
proceed
Input the test_tokens.py file to a module built into the standard library
called tokenize. You’ll see the list of tokens by line and character. Use
the -e flag to output the exact token names:
In the output, the first column is the range of the line and column
coordinates, the second column is the name of the token, and the final
column is the value of the token.
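You can produce the same three columns from your own code with tokenize.generate_tokens(). The sketch below feeds it the demo source as a string. exact_tokens is a hypothetical helper, and the column widths are arbitrary.

```python
import io
import token
import tokenize

SOURCE = "# Demo application\ndef my_function():\n    proceed\n"


def exact_tokens(source):
    """Yield (range, exact token name, value) for each token in source."""
    readline = io.StringIO(source).readline
    for tok in tokenize.generate_tokens(readline):
        (r1, c1), (r2, c2) = tok.start, tok.end
        yield f"{r1},{c1}-{r2},{c2}:", token.tok_name[tok.exact_type], repr(tok.string)


for location, name, value in exact_tokens(SOURCE):
    print(f"{location:12} {name:10} {value}")
```

Looking names up by exact_type is what turns a generic OP token into a specific name such as LPAR or COLON, which mirrors what the -e flag does on the command line.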
It’s best practice to have a blank line at the end of your Python source
files. If you omit it, then CPython adds one for you.
To see a verbose readout of the C parser, you can run a debug build
of Python with the -d flag. Using the test_tokens.py script you created
earlier, run it with the following:
$ ./python -d test_tokens.py
Conclusion
In this chapter, you’ve been introduced to the Python grammar defini-
tions and parser generator. In the next chapter, you’ll expand on that
knowledge to build a more complex syntax feature, an “almost-equal”