UNIT-1: Introduction To Computer Organization
1. Computer types
A computer can be defined as a fast electronic calculating machine that accepts
the digitized input information, processes it as per the list of internally stored
instructions and produces the resulting output information.
The list of instructions is called a program, and the internal storage is called computer
memory.
2. Functional Units
A computer consists of five functionally independent main parts: input, memory,
arithmetic logic unit (ALU), output and control unit.
The input device accepts coded information as a source program, i.e. a program written in a
high-level language. This is either stored in the memory or immediately used by the processor to
perform the desired operations. The program stored in the memory determines the
processing steps. Basically, the computer converts the source program to an object
program, i.e. into machine language.
Finally, the results are sent to the outside world through the output device. All of
these actions are coordinated by the control unit.
Input unit: -
A high-level language program and coded data are fed to a computer through input
devices. The keyboard is the most common type. Whenever a key is pressed, the corresponding
letter or digit is translated into its equivalent binary code and sent over a cable either to
the memory or to the processor. Microphones, joysticks, trackballs, mice, scanners, etc. are
other input devices.
Memory unit: -
Its function is to store programs and data. It is basically of two types:
(i) Primary memory: - This is the memory exclusively associated with the processor; it operates
at electronic speeds. Programs must be stored in this memory while they are being
executed. The memory contains a large number of semiconductor storage cells, each
capable of storing one bit of information. These are processed in groups of fixed size
called words.
The number of bits in each word is called the word length of the computer.
Instructions and data can be written into the
memory or read out under the control of the processor.
Memory in which any location can be reached in a short and fixed amount of
time after specifying its address is called random-access memory (RAM). The time
required to access one word is called the memory access time.
Caches are the small fast RAM units, which are coupled with the processor and
are often contained on the same IC chip to achieve high performance. Although primary
storage is essential it tends to be expensive.
(ii) Secondary memory: - Is used where large amounts of data & programs have to
be stored, particularly information that is accessed infrequently.
Arithmetic logic unit (ALU): -
Most computer operations, such as addition, subtraction, multiplication and division,
are executed in the ALU of the processor. The operands are brought into the ALU
from memory and stored in high-speed storage elements called registers. Then,
according to the instructions, the operation is performed in the required sequence.
The control unit and the ALU are many times faster than other devices connected to a
computer system. This enables a single processor to control a number of external devices
such as keyboards, displays, magnetic and optical disks, sensors and other mechanical
controllers.
Output unit:-
This is the counterpart of the input unit. Its basic function is to send the processed
results to the outside world.
Examples: - Printers, speakers and monitors. Units that provide both an input
and an output function are called I/O units.
Control unit:-
It effectively is the nerve center that sends signals to other units and senses their
states. The actual timing signals that govern the transfer of data between input unit,
processor, memory and output unit are generated by the control unit.
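A typical instruction is Add LOCA, R0, which adds the operand at memory location LOCA to
the contents of register R0 and places the sum in R0. It is executed in the following steps:
1. First the instruction is fetched from the memory into the processor.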
Computer Organization, Unit-I, VTA Page 4
2. The operand at LOCA is fetched and added to the contents of R0
3. Finally the resulting sum is stored in the register R0
The preceding Add instruction combines a memory access operation with an ALU
operation. In some other types of computers, these two operations are performed
by separate instructions for performance reasons:
Load LOCA, R1
Add R1, R0
Transfers between the memory and the processor are started by sending the
address of the memory location to be accessed to the memory unit and issuing the
appropriate control signals. The data are then transferred to or from the memory.
The instruction register (IR):- Holds the instruction that is currently being executed.
Its output is available to the control circuits, which generate the timing signals that
control the various processing elements involved in executing the instruction.
Besides the IR and the program counter (PC), there are n general-purpose registers, R0 through Rn-1.
The other two registers which facilitate communication with memory are: -
1. MAR – (Memory Address Register):- It holds the address of the location to be
accessed.
2. MDR – (Memory Data Register):- It contains the data to be written into or read
out of the address location.
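The role of the two registers can be illustrated with a small sketch. This is a toy model only: the class name `SimpleMemoryInterface` and its methods are hypothetical, and real hardware performs these steps with control signals over the bus rather than method calls.

```python
class SimpleMemoryInterface:
    """Toy model of the MAR/MDR path between processor and memory (hypothetical)."""

    def __init__(self, size):
        self.memory = [0] * size  # main memory, one word per cell
        self.mar = 0              # Memory Address Register
        self.mdr = 0              # Memory Data Register

    def read(self, address):
        self.mar = address                # 1. address of the location goes into MAR
        self.mdr = self.memory[self.mar]  # 2. memory delivers the word into MDR
        return self.mdr                   # 3. processor takes the word from MDR

    def write(self, address, value):
        self.mar = address                # 1. address of the location goes into MAR
        self.mdr = value                  # 2. data to be written goes into MDR
        self.memory[self.mar] = self.mdr  # 3. memory stores the word from MDR


bus = SimpleMemoryInterface(size=16)
bus.write(5, 42)
print(bus.read(5))  # 42
```

In both directions the address always passes through MAR and the data always passes through MDR, which is why these two registers are said to facilitate communication with memory.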
An interrupt is a request signal from an I/O device for service by the processor.
The processor provides the requested service by executing an appropriate interrupt
service routine.
The diversion may change the internal state of the processor, so its state must
be saved in memory before the interrupt is serviced. When the interrupt service routine
is completed, the state of the processor is restored so that the interrupted program
may continue.
4. Bus structure
The simplest and most common way of interconnecting various parts of the
computer is to use a single bus. To achieve a reasonable speed of operation, a computer
must be organized so that all its units can handle one full word of data at a given time.
A group of lines that serves as a connecting path for several devices is called a bus.
In addition to the lines that carry the data, the bus must have lines for address and
control purposes. The simplest way to interconnect the units is to use a single bus, as shown
in the figure.
Since the bus can be used for only one transfer at a time, only two units can
actively use the bus at any given time. Bus control lines are used to arbitrate multiple
requests for use of the bus.
The main advantages of the single-bus structure are:
Low cost
Very flexible for attaching peripheral devices
The interconnected devices do not all operate at the same speed, which leads to a timing
problem. This is solved by using buffer registers. These buffers are
electronic registers of small capacity when compared to the main memory but of
comparable speed.
The information from the processor is loaded into these buffers at once, and the
complete transfer of data then takes place at a fast rate.
Multi Bus structure
The computer must already contain some system software to enter and run application
programs. System software is a collection of programs that are executed as needed to perform
functions such as,
The operating system is a collection of routines that is used to control the sharing of and
interaction among various computer units as they execute application programs. The OS routines
perform the tasks required to assign computer resources to individual application programs.
These tasks include assigning memory and magnetic disk space to program and data files,
moving data between memory and disk units, and handling I/O operations.
Let us consider a system with one processor, one disk, and one printer. Assume that part
of the program’s task involves reading a data file from the disk into the memory, performing
some computations on the data, and printing the results.
A convenient way to illustrate the sharing of the processor time is by a time-line diagram, as
shown in the following diagram.
The computer resources can be used more efficiently if several application programs are
to be processed. Notice that the disk and the processor are idle during most of the time period t4
to t5. The operating system can load the next program to be executed into the memory from the
disk while the printer is operating. Similarly, during t0 to t1, the operating system can arrange to
print the previous program’s results while the current program is being loaded from the disk.
Thus, the operating system manages the concurrent execution of several application
programs to make the best possible use of computer resources. This pattern of concurrent
execution is called multiprogramming or multitasking.
Assembly languages: Languages with a lower level of abstraction are called
assembly languages.
Assembly languages are platform dependent.
Each microprocessor has its own assembly language instruction set, i.e. a program written in
the assembly language of one microprocessor cannot be run on a computer that has a different
microprocessor.
A newer version of a microprocessor is usually able to execute assembly language programs
designed for previous versions.
Eg: Intel’s Pentium III microprocessor can run programs written in the assembly language
of the Pentium II, Pentium Pro, Pentium, etc.
Machine languages: These languages contain the binary values that cause the
microprocessor to perform certain operations.
When a microprocessor reads and executes an instruction, it is a machine language
instruction.
Programs written in high-level or assembly languages are converted into machine language
to be executed by the microprocessor.
Machine language code is also backward compatible.
Translators
Compiler:
A compiler converts a high-level language program into a machine language program
(object code). The compiler takes a high-level language program as input, checks each
statement for syntax errors, and generates object code (the machine language equivalent of
the source code).
After compilation, a linker combines our object code with other required object code
and finally produces an executable file.
A loader copies this executable file into memory, where it is executed by the microprocessor.
High-level languages are platform independent, so the same source program can be compiled
into different object code for different platforms.
A high-level language statement is usually converted to a sequence of several machine code
instructions.
Assembler:
The assembler takes an assembly language program as input and converts it into object code.
The linking and loading procedures then follow for execution.
Some small applications written in assembly language do not need to be linked with other
object files; the object file produced by the assembler can be executed directly.
Linker
A linker is a computer program that takes one or more object files generated by
a compiler and combines them into one executable program.
Computer programs are usually made up of multiple modules that span separate object files,
each being a compiled computer program. The program as a whole refers to these separately-
compiled object files using symbols. The linker combines these separate files into a single,
unified program; resolving the symbolic references as it goes along.
Dynamic linking is a similar process, available on many operating systems, which
postpones the resolution of some symbols until the program is executed. When the program
is run, these dynamic link libraries are loaded as well. Dynamic linking does not require a
linker.
The linker bundled with most Linux systems is called ld.
Loaders
The total time required to execute a program is called the elapsed time, which is a
measure of the performance of the entire computer system. It is affected by the speed of
the processor, the disk and the printer. The time during which the processor is actually
executing the program's instructions is called the processor time.
Just as the elapsed time for the execution of a program depends on all units in a
computer system, the processor time depends on the hardware involved in the execution
of individual machine instructions. This hardware comprises the processor and the
memory which are usually connected by the bus as shown in the fig.
The processor and relatively small cache memory can be fabricated on a single
IC chip. The internal speed of performing the basic steps of instruction processing on
chip is very high and is considerably faster than the speed at which the instruction and
data can be fetched from the main memory. A program will be executed faster if the
movement of instructions and data between the main memory and the processor is
minimized, which is achieved by using the cache.
For example:- Suppose a number of instructions are executed repeatedly over a short
period of time as happens in a program loop. If these instructions are available in the
cache, they can be fetched quickly during the period of repeated use. The same applies to
the data that are used repeatedly.
Processor clock: -
Processor circuits are controlled by a timing signal called the clock. The clock
defines regular time intervals called clock cycles. To execute a machine instruction,
the processor divides the action to be performed into a sequence of basic steps such that each
step can be completed in one clock cycle. The length P of one clock cycle is an important
parameter that affects processor performance. Its inverse is the clock rate, R = 1/P,
measured in cycles per second.
Processors used in today’s personal computers and workstations have clock rates
that range from a few hundred million to over a billion cycles per second.
We now focus our attention on the processor time component of the total elapsed
time. Let ‘T’ be the processor time required to execute a program that has been prepared
in some high-level language.
The compiler generates a machine language object program that corresponds to
the source program. Assume that complete execution of the program requires the execution
of N machine language instructions.
The number N is the actual number of instruction executions and is not
necessarily equal to the number of machine instructions in the object program.
Some instructions may be executed more than once, which is the case for instructions
inside a program loop; others may not be executed at all, depending on the input data used.
Suppose that the average number of basic steps needed to execute one machine
instruction is S, where each basic step is completed in one clock cycle. If the clock rate
is ‘R’ cycles per second, the program execution time is given by
$T = \frac{N \times S}{R}$
This is often referred to as the basic performance equation.
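As a quick numerical illustration of the basic performance equation, the sketch below evaluates T = (N × S) / R. The function name `execution_time` and the sample values for N, S and R are made up for the example and do not describe any particular processor.

```python
def execution_time(n_instructions, steps_per_instruction, clock_rate_hz):
    """Return the processor time T = (N * S) / R in seconds."""
    return n_instructions * steps_per_instruction / clock_rate_hz


# Hypothetical program: N = 2 million executed instructions,
# S = 4 basic steps per instruction on average, R = 500 MHz clock.
t = execution_time(2_000_000, 4, 500_000_000)
print(t)  # 0.016
```

Halving S (for example through better overlap of steps) or doubling R would each halve T here, provided the other two parameters stay unchanged.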
We must emphasize that N, S and R are not independent parameters; changing one
may affect another.
Introducing a new feature in the design of a processor will lead to improved
performance only if the overall result is to reduce the value of T.
We assume that instructions are executed one after the other. Hence the value of
S is the total number of basic steps, or clock cycles, required to execute one instruction.
A substantial improvement in performance can be achieved by overlapping the execution
of successive instructions using a technique called pipelining.
Consider the instruction Add R1, R2, R3.
This adds the contents of R1 and R2 and places the sum into R3.
The contents of R1 and R2 are first transferred to the inputs of the ALU. After the
addition operation is performed, the sum is transferred to R3. The processor can read the
next instruction from the memory while the addition operation is being performed. Then,
if that instruction also uses the ALU, its operands can be transferred to the ALU inputs at
the same time that the result of the Add instruction is being transferred to R3.
In the ideal case if all instructions are overlapped to the maximum degree
possible the execution proceeds at the rate of one instruction completed in each clock
cycle. Individual instructions still require several clock cycles to complete. But for the
purpose of computing T, effective value of S is 1.
Clock rate
There are two possibilities for increasing the clock rate ‘R’.
1. Improving the IC technology makes logic circuits faster, which reduces the time
needed to complete a basic step. This allows the clock period P to be reduced and the
clock rate R to be increased.
2. Reducing the amount of processing done in one basic step also makes it possible
to reduce the clock period P. However, if the actions that have to be performed by
an instruction remain the same, the number of basic steps needed may increase.
Performance measurements
It is very important to be able to assess the performance of a computer. Designers
use performance estimates to evaluate the effectiveness of new features.
The performance measure is the time taken by the computer to execute a given
benchmark. Initially, some attempts were made to create artificial programs that could be
used as benchmark programs. But synthetic programs do not properly predict the
performance obtained when real application programs are run.
The programs selected range from game playing, compilers, and database
applications to numerically intensive programs in astrophysics and quantum chemistry. In
each case, the program is compiled for the computer under test, and the running time on a
real computer is measured. The same program is also compiled and run on one computer
selected as a reference. The SPEC rating for the program is then computed as
SPEC rating = (running time on the reference computer) / (running time on the computer under test)
A SPEC rating of 50 means that the computer under test is 50 times as fast as the
UltraSPARC 10 for this benchmark. This is repeated for all the programs in the SPEC suite,
and the geometric mean of the results is computed.
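Let SPECi be the rating for program ‘i’ in the suite. The overall SPEC rating for
the computer is given by
$\text{SPEC rating} = \left( \prod_{i=1}^{n} \text{SPEC}_i \right)^{1/n}$
where n is the number of programs in the suite.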
Since actual execution time is measured, the SPEC rating is a measure of the combined
effect of all factors affecting performance, including the compiler, the OS, the processor, and
the memory of the computer being tested.
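The calculation of an overall rating as a geometric mean can be sketched as follows. This assumes the usual definition of a per-program rating as reference running time divided by running time on the computer under test; the function names and the running times are invented for illustration and are not real SPEC results.

```python
import math


def spec_rating(reference_time, test_time):
    """Per-program rating: how many times faster the tested computer is."""
    return reference_time / test_time


def overall_spec_rating(ratings):
    """Geometric mean: the n-th root of the product of the n ratings."""
    return math.prod(ratings) ** (1.0 / len(ratings))


# Made-up running times in seconds: (reference computer, computer under test)
times = [(100.0, 2.0), (90.0, 10.0), (80.0, 5.0)]
ratings = [spec_rating(ref, test) for ref, test in times]
print(ratings)                       # [50.0, 9.0, 16.0]
print(overall_spec_rating(ratings))
```

The geometric mean is used rather than the arithmetic mean so that one program with a very large rating does not dominate the overall figure.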
• All types of data, except binary numbers, are represented in binary-coded form
• A number system of base, or radix, r is a system that uses distinct symbols for r digits
• Numbers are represented by a string of digit symbols
• The string of digits 724.5 represents the quantity
$7 \times 10^{2} + 2 \times 10^{1} + 4 \times 10^{0} + 5 \times 10^{-1}$
• The string of digits 101101 in the binary number system represents the quantity
$1 \times 2^{5} + 0 \times 2^{4} + 1 \times 2^{3} + 1 \times 2^{2} + 0 \times 2^{1} + 1 \times 2^{0} = 45$
• $(101101)_2 = (45)_{10}$
• We will also use the octal (radix 8) and hexadecimal (radix 16) number systems
$(736.4)_8 = 7 \times 8^{2} + 3 \times 8^{1} + 6 \times 8^{0} + 4 \times 8^{-1} = (478.5)_{10}$
$(F3)_{16} = F \times 16^{1} + 3 \times 16^{0} = (243)_{10}$
• Conversion from decimal to radix r system is carried out by separating the number into
its integer and fraction parts and converting each part separately
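The conversion procedure described in the last point can be sketched as below: the integer part is converted by repeated division by r, and the fraction part by repeated multiplication by r. The function name `to_radix` and the `fraction_digits` limit are assumptions made for this example, and the sketch handles only non-negative values and radices up to 16.

```python
DIGITS = "0123456789ABCDEF"


def to_radix(value, r, fraction_digits=4):
    """Convert a non-negative decimal number to a digit string in radix r."""
    integer_part = int(value)
    fraction_part = value - integer_part

    # Integer part: divide by r repeatedly; remainders are the digits,
    # produced from least significant to most significant.
    int_digits = ""
    while integer_part > 0:
        integer_part, remainder = divmod(integer_part, r)
        int_digits = DIGITS[remainder] + int_digits
    int_digits = int_digits or "0"

    # Fraction part: multiply by r repeatedly; the integer part of each
    # product is the next digit after the radix point.
    frac_digits = ""
    while fraction_part > 0 and len(frac_digits) < fraction_digits:
        fraction_part *= r
        digit = int(fraction_part)
        frac_digits += DIGITS[digit]
        fraction_part -= digit

    return int_digits + ("." + frac_digits if frac_digits else "")


print(to_radix(45, 2))     # 101101
print(to_radix(478.5, 8))  # 736.4
print(to_radix(243, 16))   # F3
```

The three calls reproduce the examples given above in the reverse direction. Fractions that have no finite representation in radix r are simply truncated after `fraction_digits` digits.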
Complements
• Complements are used in digital computers for simplifying subtraction and logical
manipulation
• Two types of complements for each base r system: r’s complement and (r – 1)’s
complement
• Given a number N in base r having n digits, the (r – 1)’s complement of N is defined as
$(r^{n} - 1) - N$
• For decimal, the 9’s complement of N is $(10^{n} - 1) - N$
• The 9’s complement of 546700 is 999999 – 546700 = 453299
• The 9’s complement of 453299 is 999999 – 453299 = 546700
• For binary, the 1’s complement of N is $(2^{n} - 1) - N$
• The 1’s complement of 1011001 is 1111111 – 1011001 = 0100110
• The 1’s complement is the true complement of the number – just toggle all bits
• The r’s complement of an n-digit number N in base r is defined as $r^{n} - N$
• This is the same as adding 1 to the (r – 1)’s complement
• The 10’s complement of 2389 is 7610 + 1 = 7611
• The 2’s complement of 101100 is 010011 + 1 = 010100
• Subtraction of unsigned n-digit numbers: M – N
o Add M to the r’s complement of N – this results in
$M + (r^{n} - N) = M - N + r^{n}$
o If M ≥ N, the sum will produce an end carry $r^{n}$, which is discarded
o If M < N, the sum does not produce an end carry and is equal to $r^{n} - (N - M)$,
which is the r’s complement of (N – M). To obtain the answer in a familiar form,
take the r’s complement of the sum and place a negative sign in front.
Example: 72532 – 13250 = 59282. The 10’s complement of 13250 is 86750.
M = 72532
10’s comp. of N = +86750
Sum = 159282
Discard end carry = -100000
Answer = 59282
Example for M < N: 13250 – 72532 = -59282
M = 13250
10’s comp. of N = +27468
Sum = 40718
No end carry
Answer = -59282 (10’s comp. of 40718)
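Example for X = 1010100 and Y = 1000011
X – Y:
X = 1010100
2’s comp. of Y = +0111101
Sum = 10010001
Discard end carry = -10000000
Answer X – Y = 0010001
Y – X:
Y = 1000011
2’s comp. of X = +0101100
Sum = 1101111
No end carry
Answer Y – X = -0010001 (negated 2’s comp. of 1101111)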
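The subtraction procedure using r’s complements can be checked with a short sketch. The function names are hypothetical, numbers are held as ordinary Python integers, and `num_digits` is the digit count n used in the definition of the complement.

```python
def r_complement(n, r, num_digits):
    """r's complement of an n-digit number N in base r: r**n - N."""
    return r ** num_digits - n


def subtract_with_complement(m, n, r, num_digits):
    """Compute M - N by adding M to the r's complement of N."""
    modulus = r ** num_digits
    total = m + r_complement(n, r, num_digits)
    if total >= modulus:
        # End carry produced (M >= N): discard the carry.
        return total - modulus
    # No end carry (M < N): take the r's complement of the sum and negate it.
    return -r_complement(total, r, num_digits)


print(subtract_with_complement(72532, 13250, 10, 5))         # 59282
print(subtract_with_complement(13250, 72532, 10, 5))         # -59282
print(subtract_with_complement(0b1010100, 0b1000011, 2, 7))  # 17
```

The last result, 17, is 0010001 in binary, which matches the X – Y example worked by hand.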
Number and character operands, as well as instructions, are stored in the memory
of a computer. The memory consists of many millions of storage cells, each of which can
store a bit of information having the value 0 or 1.
Because a single bit represents a very small amount of information, bits are
seldom handled individually.
The usual approach is to deal with them in groups of fixed size. For this
purpose, the memory is organized so that a group of n bits can be stored or retrieved in a
single, basic operation.
Each group of n bits is referred to as a word of information, and n is called
the word length. The memory of a computer can be schematically represented as a
collection of words as shown in figure.
Modern computers have word lengths that typically range from 16 to 64 bits. If
the word length of a computer is 32 bits, a single word can store a 32-bit 2’s complement
number or four ASCII characters, each occupying 8 bits (1 byte) as shown in figure.
We now have three basic information quantities to deal with: the bit, byte and
word. A byte is always 8 bits, but the word length typically ranges from 16 to 64 bits.
The most practical assignment is to have successive addresses refer to successive byte
locations in the memory. This is the assignment used in most modern computers. The
term byte-addressable memory is used for this assignment. Byte locations have addresses
0,1,2, …. Thus, if the word length of the machine is 32 bits, successive words are located
at addresses 0,4,8,…., with each word consisting of four bytes.
There are two ways that byte addresses can be assigned across words, as shown
in fig b. The name big-endian is used when lower byte addresses are used for the more
significant bytes (the leftmost bytes) of the word. The name little-endian is used for the
opposite ordering, where the lower byte addresses are used for the less significant bytes
(the rightmost bytes) of the word.
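The two byte orderings can be seen directly with Python's built-in `int.to_bytes`. The helper name `byte_layout` and the sample word 0x12345678 are chosen only for this illustration; the returned list is ordered from the lowest byte address to the highest.

```python
def byte_layout(value, order, num_bytes=4):
    """Return the bytes of `value` listed from lowest to highest byte address."""
    return list(value.to_bytes(num_bytes, byteorder=order))


word = 0x12345678
# Big-endian: the most significant byte (0x12) sits at the lowest address.
print([hex(b) for b in byte_layout(word, "big")])     # ['0x12', '0x34', '0x56', '0x78']
# Little-endian: the least significant byte (0x78) sits at the lowest address.
print([hex(b) for b in byte_layout(word, "little")])  # ['0x78', '0x56', '0x34', '0x12']
```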
WORD ALIGNMENT:-
In the case of a 32-bit word length, natural word boundaries occur at addresses 0,
4, 8, …, as shown in above fig. We say that the word locations have aligned addresses.
In general, words are said to be aligned in memory if they begin at a byte address that is a
multiple of the number of bytes in a word. The number of bytes in a word is a power of
2. Hence, if the word length is 16 (2 bytes), aligned words begin at byte addresses
0,2,4,…, and for a word length of 64 (2^3 bytes), aligned words begin at byte addresses
0,8,16 ….
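The alignment rule amounts to a single divisibility test, sketched below. The function name `is_aligned` is hypothetical, and the word length is assumed to be a whole number of 8-bit bytes.

```python
def is_aligned(address, word_length_bits):
    """True if a word starting at this byte address is aligned."""
    bytes_per_word = word_length_bits // 8
    return address % bytes_per_word == 0


# Aligned byte addresses below 17 for three word lengths
for bits in (16, 32, 64):
    print(bits, [a for a in range(17) if is_aligned(a, bits)])
# 16 [0, 2, 4, 6, 8, 10, 12, 14, 16]
# 32 [0, 4, 8, 12, 16]
# 64 [0, 8, 16]
```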
Memory Operations
Both program instructions and data operands are stored in the memory. To
execute an instruction, the processor control circuits must cause the word (or words)
containing the instruction to be transferred from the memory to the processor.
Operands and results must also be moved between the memory and the processor.
Thus, two basic operations involving the memory are needed, namely, Load (or Read or
Fetch) and Store (or Write).
The load operation transfers a copy of the contents of a specific memory location
to the processor.
The memory contents remain unchanged. To start a Load operation, the processor
sends the address of the desired location to the memory and requests that its
contents be read.
The memory reads the data stored at that address and sends them to the processor.
An information item of either one word or one byte can be transferred between
the processor and the memory in a single operation. This transfer actually takes place
between a CPU register and the main memory.
For example, names for the addresses of memory locations may be LOC, PLACE, A,
VAR2; processor register names may be R0, R5; and I/O register names may be
DATAIN, OUTSTATUS, and so on.
The contents of a location are denoted by placing square brackets around the
name of the location. Thus, the expression
R1 ← [LOC]
means that the contents of memory location LOC are transferred into processor register
R1.
As another example, consider the operation that adds the contents of registers R1
and R2, and then places their sum into register R3. This action is indicated as
R3 ← [R1] + [R2]
This type of notation is known as Register Transfer Notation (RTN). Note that
the right-hand side of an RTN expression always denotes a value, and the left-hand side
is the name of a location where the value is to be placed, overwriting the old contents of
that location.
ASSEMBLY LANGUAGE NOTATION:-
We need another type of notation to represent machine instructions and programs. For
this, we use an assembly language format. For example, an instruction that causes the
transfer described above, from memory location LOC to processor register R1, is
specified by the statement
Move LOC, R1
The contents of LOC are unchanged by the execution of this instruction, but the
old contents of register R1 are overwritten.
To carry out this action, the contents of memory locations A and B are fetched
from the memory and transferred into the processor where their sum is computed. This
result is then sent back to the memory and stored in location C.
Operands A and B are called the source operands, C is called the destination
operand, and Add is the operation to be performed on the operands. A general instruction
of this type has the format.
Operation Source1, Source2, Destination
If k bits are needed to specify the memory address of each operand, the encoded
form of the above instruction must contain 3k bits for addressing purposes in addition to
the bits needed to denote the Add operation.
Which performs the operation C ← [B], leaving the contents of location B unchanged.
Move B,C
Add A,C
Load A
Add B
Store C
Are generalizations of the Load, Store, and Add instructions for the single-accumulator
case, in which register Ri performs the function of the accumulator.
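The two instruction sequences listed above both carry out the task of adding the contents of locations A and B and storing the sum in C. The sketch below mimics each sequence step by step; memory is modelled as a Python dictionary keyed by location name, and the function names and sample values are invented for the example.

```python
def run_two_address(memory):
    """Move B,C followed by Add A,C."""
    memory["C"] = memory["B"]                # Move B,C : C <- [B]
    memory["C"] = memory["A"] + memory["C"]  # Add A,C  : C <- [A] + [C]
    return memory


def run_one_address(memory):
    """Load A, Add B, Store C using a single accumulator."""
    acc = memory["A"]        # Load A  : accumulator <- [A]
    acc = acc + memory["B"]  # Add B   : accumulator <- accumulator + [B]
    memory["C"] = acc        # Store C : C <- accumulator
    return memory


print(run_two_address({"A": 5, "B": 7, "C": 0}))  # {'A': 5, 'B': 7, 'C': 12}
print(run_one_address({"A": 5, "B": 7, "C": 0}))  # {'A': 5, 'B': 7, 'C': 12}
```

Both sequences leave A and B unchanged. The one-address form needs one more instruction but each instruction names only one memory location.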
BRANCHING:-
Consider the task of adding a list of n numbers. The program is shown in fig(a).
Instead of using a long list of add instructions, it is possible to place a single add
instruction in a program loop, as shown in fig (b).
The loop is a straight-line sequence of instructions executed as many times
as needed. It starts at location LOOP and ends at the instruction Branch > 0.
During each pass through this loop, the address of the next list entry is
determined, and that entry is fetched and added to R0.
A branch instruction loads a new value into the program counter. As a result,
the processor fetches and executes the instruction at this new address, called the branch
target, instead of the instruction at the location that follows the branch instruction in
sequential address order. A conditional branch instruction causes a branch only if a
specified condition is satisfied. If the condition is not satisfied, the PC is incremented in
the normal way, and the next instruction in sequential address order is fetched and
executed.
Branch > 0 LOOP
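The behaviour of the loop closed by Branch>0 LOOP can be sketched as follows. The figure with the actual program is not reproduced here, so the use of R1 as a loop counter that is decremented on each pass is an assumption of this sketch; R0 as the running sum comes from the text. Like the assembly loop, the sketch assumes the list contains at least one number.

```python
def sum_list(numbers):
    """Add a list of n numbers using a counted loop (n >= 1 assumed)."""
    r0 = 0             # running sum, cleared before the loop
    r1 = len(numbers)  # loop counter, initialised to n (assumed register R1)
    index = 0          # stands in for "address of the next list entry"
    while True:        # LOOP:
        r0 += numbers[index]  # fetch the next entry and add it to R0
        index += 1            # advance to the next entry
        r1 -= 1               # decrement the counter
        if not r1 > 0:        # Branch>0 LOOP: go back only while R1 > 0
            break
    return r0


print(sum_list([3, 5, 7]))  # 15
```

The body executes once per list entry; when the counter reaches zero the branch condition fails and execution continues with the instruction after the loop.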
CONDITION CODES:-
The processor keeps track of information about the results of various operations
for use by subsequent conditional branch instructions.
This is accomplished by recording the required information in individual bits,
often called condition code flags.
These flags are usually grouped together in a special processor register called
the condition code register or status register.
Individual condition code flags are set to 1 or cleared to 0, depending on the
outcome of the operation performed.
Let us return to fig b. The purpose of the instruction block at LOOP is to add a
different number from the list during each pass through the loop.
Hence, the Add instruction in the block must refer to a different address
during each pass. How are the
ADDRESSING MODES:
When a microprocessor accesses memory to either read or write data, it must specify the
memory address it needs to access.
An assembly language instruction may use one of several addressing modes to generate this
address.
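1) Direct Mode: in this the instruction includes the memory address of the operand.
Eg: LD AC 5 - it loads the data at memory location 5 into the microprocessor's AC register.
5: 10 stores into AC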
2) Indirect Mode: in this the instruction includes the address of a memory location that
contains the address of an operand.
Eg: LD AC (5) - The CPU first reads the contents of location 5, which is 10; it then reads
the data from location 10 and loads it into the accumulator.
5: 10
3) Register Direct Mode: the instruction specifies the register that contains the operand.
Ex: 0:LDAC R
R: 5 stores into AC
4) Register Indirect mode: the instruction contains the register that contains the
address of an operand.
Eg: LOAD (R)
Ex: 0:LDAC R
R: 5
5: 10 stores into AC
5) Immediate mode: the actual data is stored in instruction itself.
Ex: LOAD #5 the value 5 is loaded into AC register.
6) Implicit Mode: The instruction implicitly specifies the operand because it always
applies to a specific register.
Ex: CLAC Clear accumulator register.
7) Relative Mode: In this mode, the operand is an offset, not the actual address. It is
added to the contents of the CPU's program counter to generate the required address.
The PC contains the address of the next instruction.
Ex: LD AC $5
EX: 0: LDAC $5
1:
5:
EA = effective address
Value = a signed number
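The modes described above differ only in how the operand value is located. The sketch below puts them side by side for a load into AC. It is a simplified model, not the behaviour of any specific microprocessor: the function name, the mode labels, and the contents of memory locations 6 and 10 are made up, while location 5 holding 10 and register R holding 5 follow the examples in the text.

```python
def fetch_operand(mode, operand, memory, registers, pc):
    """Return the value that would be loaded into AC for the given mode."""
    if mode == "immediate":          # LDAC #5  -> the operand itself
        return operand
    if mode == "direct":             # LDAC 5   -> memory[5]
        return memory[operand]
    if mode == "indirect":           # LDAC (5) -> memory[memory[5]]
        return memory[memory[operand]]
    if mode == "register":           # LDAC R   -> contents of R
        return registers[operand]
    if mode == "register_indirect":  # LDAC (R) -> memory[contents of R]
        return memory[registers[operand]]
    if mode == "relative":           # LDAC $5  -> memory[PC + 5]
        return memory[pc + operand]
    raise ValueError(f"unknown addressing mode: {mode}")


memory = {5: 10, 6: 77, 10: 99}  # contents of 6 and 10 are made-up values
registers = {"R": 5}
pc = 1                           # address of the next instruction

for mode, operand in [("immediate", 5), ("direct", 5), ("indirect", 5),
                      ("register", "R"), ("register_indirect", "R"),
                      ("relative", 5)]:
    print(mode, fetch_operand(mode, operand, memory, registers, pc))
# immediate 5
# direct 10
# indirect 99
# register 5
# register_indirect 10
# relative 77
```

The implicit mode is not shown because it has no operand to resolve; the instruction itself names the register it acts on.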
ADDITIONAL MODES:
Autoincrement mode – the effective address of the operand is the contents of a register
specified in the instruction. After accessing the operand, the contents of this register are
automatically incremented to point to the next item in a list.
(Ri)+
Autodecrement mode – the contents of a register specified in the instruction are first
automatically decremented and are then used as the effective address of the operand.
-(Ri)
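The ordering of the access and the register update is what separates the two modes, as the following sketch shows. The function names and the register name R2 are hypothetical, memory is a plain Python list, and the step of 1 assumes one list item per address.

```python
def autoincrement(memory, registers, reg, step=1):
    """(Ri)+ : use the register as the address, then increment it."""
    value = memory[registers[reg]]  # access the operand first
    registers[reg] += step          # then point to the next item
    return value


def autodecrement(memory, registers, reg, step=1):
    """-(Ri) : decrement the register first, then use it as the address."""
    registers[reg] -= step          # move the pointer back first
    return memory[registers[reg]]   # then access the operand


memory = [10, 20, 30]
registers = {"R2": 0}
print(autoincrement(memory, registers, "R2"))  # 10
print(autoincrement(memory, registers, "R2"))  # 20
print(autodecrement(memory, registers, "R2"))  # 20
print(registers["R2"])                         # 1
```

Used together on the same register, the two modes behave like the pop and push operations of a stack.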
There are two important differences between how a stack and a queue are
implemented.
One end of the stack is fixed (the bottom), while the other end rises and falls
as data are pushed and popped. A single pointer is needed to point to the top of the stack
at any given time.
On the other hand, both ends of a queue move to higher addresses as data
are added at the back and removed from the front.
So two pointers are needed to keep track of the two ends of the queue.
Subroutines
In a given program, it is often necessary to perform a particular subtask many
times on different data-values. Such a subtask is usually called a subroutine.
For example, a subroutine may evaluate the sine function or sort a list of values into
increasing or decreasing order.
After a subroutine has been executed, the calling program must resume
execution, continuing immediately after the instruction that called the subroutine.
The subroutine is said to return to the program that called it by executing a Return
instruction.
The way in which a computer makes it possible to call and return from
subroutines is referred to as its subroutine linkage method.
The simplest subroutine linkage method is to save the return address in a
specific location, which may be a register dedicated to this function.
Such a register is called the link register. When the subroutine completes its task,
the Return instruction returns to the calling program by branching indirectly through the link
register.
sequence. That is, return addresses are generated and used in a last-in-first-out order.
This suggests that the return addresses associated with subroutine calls should be pushed
onto a stack.
A particular register is designated as the stack pointer, SP, to be used in this
operation. The stack pointer points to a stack called the processor stack.
The Call instruction pushes the contents of the PC onto the processor stack and loads
the subroutine address into the PC.
The Return instruction pops the return address from the processor stack into the PC.
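The last-in-first-out handling of return addresses can be sketched with a Python list standing in for the processor stack. The class name `Processor` and the addresses used are invented for the illustration; the PC is assumed to already hold the address of the instruction following the Call when the Call is executed.

```python
class Processor:
    """Toy model of Call/Return using a processor stack (hypothetical)."""

    def __init__(self):
        self.pc = 0      # program counter
        self.stack = []  # processor stack; the top is the end of the list

    def call(self, subroutine_address):
        self.stack.append(self.pc)  # push the return address
        self.pc = subroutine_address  # branch to the subroutine

    def ret(self):
        self.pc = self.stack.pop()  # pop the return address into the PC


cpu = Processor()
cpu.pc = 204    # address following a Call in the main program
cpu.call(1000)  # main program calls SUB1
cpu.pc = 1008   # address following a Call inside SUB1
cpu.call(2000)  # SUB1 calls SUB2 (nested call)
cpu.ret()
print(cpu.pc)   # 1008, back in SUB1
cpu.ret()
print(cpu.pc)   # 204, back in the main program
```

Because the most recent return address is always on top, nested calls of any depth return in the correct order, up to the capacity of the stack.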
PARAMETER PASSING:-
When calling a subroutine, a program must provide to the subroutine the
parameters, that is, the operands or their addresses, to be used in the computation.
Later, the subroutine returns other parameters, in this case, the results of the
computation.
This exchange of information between a calling program and a subroutine is referred
to as parameter passing. Parameter passing may be accomplished in several ways.
The parameters may be placed in registers or in memory locations, where they can be
accessed by the subroutine. Alternatively, the parameters may be placed on the processor
stack used for saving the return address.
9. Additional instructions
LOGIC INSTRUCTIONS:
Logic operations such as AND, OR, and NOT, applied to individual bits, are the basic
building blocks of digital circuits. It is also useful to be able to perform
logic operations in software, which is done using instructions that apply these operations
to all bits of a word or byte independently and in parallel. For example, the instruction
Not dst