Operating system processes and threads
Central Processing Units (CPUs) operate in a fetch-execute cycle: the operating system (OS) loads a program (a set of instructions) from disk into memory, and the CPU then fetches and executes those instructions. A program in execution is called a process. Loading a program into memory to become a process divides that memory into the following sections:
- Text: The compiled program code, a static set of instructions
- Data: Static data and global variables required by the running process
- Heap: Space reserved for dynamically allocated data structures (non-static, non-global variables)
- Stack: Local variables and function call frames; because the stack and heap typically grow toward each other, a stack that grows too large can collide with allocated heap space (producing a ‘stack overflow’ or ‘insufficient heap space’ error)
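The stack/heap distinction can be observed even from a high-level language. The following sketch (an illustration, not part of the original text) shows that the interpreter's call stack is finite, while heap allocations are limited only by available memory:

```python
import sys

# Each recursive call pushes a new frame onto the call stack;
# unbounded recursion eventually exhausts it.
def recurse(depth=0):
    return recurse(depth + 1)

stack_exhausted = False
try:
    recurse()
except RecursionError:  # Python raises this instead of crashing the process
    stack_exhausted = True

# A large list, by contrast, is dynamically allocated on the heap.
heap_data = list(range(100_000))
print(stack_exhausted, len(heap_data))
```

CPython caps recursion depth (see `sys.getrecursionlimit()`) precisely to catch runaway stack growth before the OS has to terminate the process.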
Although in an asynchronous program it may appear that all instructions are executing at exactly the same time, in reality each process's work is broken into blocks that the OS schedules for execution. Those blocks execute so quickly that the processor gives the impression of computing several things at once.
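This time-slicing can be made visible with a small experiment (a sketch, not from the original text): two workers each append their name to a shared log, and the scheduler decides how their blocks of work interleave:

```python
import threading

log = []

def worker(name):
    # Each append is one small "block" of work the scheduler may
    # interleave with blocks from the other worker.
    for _ in range(3):
        log.append(name)

t1 = threading.Thread(target=worker, args=("a",))
t2 = threading.Thread(target=worker, args=("b",))
t1.start()
t2.start()
t1.join()
t2.join()
print(log)
```

The exact ordering varies from run to run; only the totals are guaranteed, which is exactly why the interleaving gives the impression of simultaneous execution.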
Switching the CPU from executing blocks of one process to executing blocks of another is a costly operation called context switching. Context switching involves interrupting the block currently being processed, saving and restoring the execution state of each process involved, and coordinating with processes that are waiting to run, among other requirements for proper process flow.
Introducing threads
Modern computers typically have multiple cores, each of which is capable of executing a process. To better handle context switching, an abstraction was created: a thread, or atomic unit of processing. Each thread runs on a single core, and a processor can simultaneously run multiple threads from a single process by taking advantage of this architecture.
Threads are also called lightweight processes, since they follow much of the structure described above for processes, but with an important difference: all threads of a process share the heap and the code/data segments, while each thread maintains its own private stack. Because memory is shared, programmers must be careful to ensure that access to shared resources is properly synchronized.
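The shared-heap property is easy to demonstrate. In this sketch (an illustration under assumed names, not from the original text), several threads update the same module-level counter, and a lock enforces the shared-resource constraint mentioned above:

```python
import threading

counter = 0               # lives in shared memory, visible to every thread
lock = threading.Lock()   # guards the shared resource

def increment(n):
    global counter
    for _ in range(n):
        # Without the lock, read-modify-write updates from different
        # threads could interleave and some increments would be lost.
        with lock:
            counter += 1

threads = [threading.Thread(target=increment, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 40000
```

Each thread sees and mutates the same `counter` object precisely because the heap is shared; only the threads' stacks (their local loop variables) are private.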
The following diagram shows how processing can vary according to CPU and OS characteristics:

Figure 1.6: A single process/single thread processor on the left and a multithreaded processor on the right
What happens if a process has more threads than available cores? The OS simply time-slices the threads across the cores via context switching. Thread context switching is ‘lighter’ than process context switching because less state needs to be saved and restored: the threads already share their memory mappings, whereas switching processes also means switching those mappings. In terms of efficiency, context switching between threads is therefore generally faster and less resource-intensive than context switching between processes.
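A quick sketch (illustrative, not from the original text) shows that a process can comfortably run many more threads than it has cores; the scheduler multiplexes them:

```python
import os
import threading
import time

cores = os.cpu_count() or 1
results = []
lock = threading.Lock()

def task(i):
    time.sleep(0.01)      # simulate work; the OS time-slices the threads
    with lock:
        results.append(i)

# Deliberately create four times as many threads as cores.
threads = [threading.Thread(target=task, args=(i,)) for i in range(cores * 4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(f"{len(results)} threads completed on {cores} core(s)")
```

All the threads complete; they simply take turns on the available cores, with the OS performing the (relatively cheap) thread context switches behind the scenes.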
Some pieces of software are multiprocess but not multithreaded, meaning that each process is single-threaded (synchronous), but the work can be split across several processes to take advantage of multiple processors.
There are two types of thread: kernel threads and user threads. User threads are created and managed entirely in user space, by the running program itself or by a threading library, without direct involvement of the OS kernel. The key point about user threads is that if one of them performs a blocking operation, the entire process is blocked. This impacts the way multithreaded programs are designed.
The lifecycle of kernel threads, on the other hand, is managed entirely by the operating system. This type of thread has the advantage that if an operation blocks one thread's execution, the parent process is not blocked. Python's default threading model is built on kernel threads managed by the underlying operating system, even though by default only one thread can run the interpreter at a time. We will explore this design in more detail in Chapter 4.
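This advantage of kernel threads is easy to see in Python. In the sketch below (an illustration, not from the original text), one thread blocks on a sleep while the main thread keeps working, so the process as a whole is never stalled:

```python
import threading
import time

done = []

def blocking_io():
    time.sleep(0.2)        # a blocking call; only this thread waits
    done.append("io")

t = threading.Thread(target=blocking_io)
t.start()
done.append("light")       # the main thread proceeds while the other blocks
t.join()
print(done)  # ['light', 'io']
```

Had these been user threads with a blocking call, the whole process would have paused for the full 0.2 seconds before any other work could run.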
Processes, kernel threads, and user threads are constructs that involve close management of the physical resources of a computer. As you might expect, modern programming languages provide abstractions to manage these concepts and the underlying resources efficiently. In the next section, we will discuss three programming concepts central to multitasking: green threads, coroutines, and fibers.