COM 323 Lecture Note
1. It uses mnemonic rather than numeric operation codes, and the assembler reports any
errors found in the code.
2. It allows symbolic operands: the programmer does not need to specify the machine
address of an operand, which can instead be represented by a symbol.
3. Data can be declared using decimal notation.
The Differences between assembly language, machine language and high level language
Machine Language
Machine language is written as strings of 1’s and 0’s. It is the only language that a
computer understands without using a translation program. A machine language instruction has
two parts: the operation code, which tells the computer the function to perform, and the
operand, which tells the computer where to find or store the data to be manipulated.
COM 323 An Assembly Language| BABA SALEH
Disadvantages
It is machine dependent, i.e. it differs from computer to computer
It is difficult to program and write
It is prone to errors
It is difficult to modify
Assembly language
It is a low level language that allows a user to write a program using alphanumeric mnemonic
codes instead of numeric codes for a set of instructions. It requires a translator known as an
assembler to convert assembly language into machine language so that it can be understood by
the computer. It is easier to remember and write than machine language.
Advantages
It is easy to understand and use
It is easy to locate and correct errors
It is easier to modify
Disadvantage
It is machine dependent
High Level Language
It is a machine independent language. It enables a user to write programs in a language which
resembles English words and familiar mathematical symbols. COBOL was the first high level
language developed for business. Each high level language statement is a macro instruction
which is translated into several machine language instructions. Other examples are C, C++,
Java, Python, etc.
A Compiler: A translator program which translates a high level language program into an
equivalent machine language program. It generates a set of machine language instructions for
every high level language program. The translated code is known as object code.
A Linker: A program used with a compiler to provide links to the libraries needed for an
executable program. It takes one or more object programs generated by a compiler and
combines them into a single executable program.
An Interpreter: A translator that executes a high level language program directly. It takes one
statement, translates it into machine language instructions, and then immediately executes the
result before moving on to the next statement. Its output is the result of program execution.
Advantages of HLL
It is machine independent
It is easier to learn and use
It is easier to maintain and gives fewer errors
Disadvantages
It lowers efficiency
It is less flexible
Step 2
The meaning of operation code, operand, instruction and register.
Operation Codes
Numeric codes called operation codes (or opcodes for short) represent the actual operations
to be performed by the CPU. The operation code of an instruction is a group of bits that
defines operations such as add, subtract, multiply, shift and complement. The number of bits
required for the operation code depends upon the total number of operations available on the
computer: the operation code must consist of at least n bits for a given 2^n (or fewer) distinct
operations. For example, a computer with 64 operations needs an operation code of at least
6 bits. The operation part of an instruction code specifies the operation to be performed.
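The relationship between the number of operations and the opcode width can be checked with a short calculation. This is a minimal Python sketch; the function name `min_opcode_bits` is an assumption made for illustration and is not part of the lecture material.

```python
def min_opcode_bits(num_operations: int) -> int:
    """Return the smallest n such that 2**n >= num_operations."""
    if num_operations < 1:
        raise ValueError("a computer needs at least one operation")
    # For k >= 1, (k - 1).bit_length() equals the ceiling of log2(k).
    return (num_operations - 1).bit_length()


for count in (16, 17, 64):
    print(count, "operations need", min_opcode_bits(count), "opcode bits")
```

Note that 17 operations already need 5 bits, because 4 bits can distinguish only 16 different codes.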
The instruction register has a part that contains the numeric opcode. A decoder determines from
the opcode the operation to be executed, and a data register controls the flow of data inside the
CPU as a result of the opcode instructions.
One important function of the opcode decoder is to determine how many bytes must be read to
execute each instruction. Many instructions require two or three bytes. Fig. below shows the
arrangement of the bytes in an instruction. The first byte contains the opcode. The second byte
contains address information, usually the lowest or least significant byte of the address.
Step 5 & 6
INSTRUCTION FORMATS
A computer will usually have a variety of instruction code formats. It is the function of the
control unit within the CPU to interpret each instruction code and provide the necessary control
functions needed to process the instruction. The format of an instruction is usually depicted in a
rectangular box symbolizing the bits of the instruction as they appear in memory words or in a
control register. The bits of the instruction are divided into groups called fields. The most
common fields found in instruction formats are:
1. An operation code field that specifies the operation to be performed.
2. An address field that designates a memory address or a processor register.
3. A mode field that specifies the way the operand or the effective address is determined.
Other special fields are sometimes employed under certain circumstances, as for example a field
that gives the number of shifts in a shift-type instruction. The operation code field of an
instruction is a group of bits that define various processor operations, such as add, subtract,
complement, and shift. The bits that define the mode field of an instruction code specify a
variety of alternatives for choosing the operands from the given address.
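To make the idea of fields concrete, the following Python sketch splits an instruction word into opcode, mode and address fields using shifts and masks. The 16-bit layout (4-bit opcode, 2-bit mode, 10-bit address) and the name `decode` are assumptions chosen for illustration; real machines use their own layouts.

```python
# Hypothetical 16-bit instruction layout: | opcode (4) | mode (2) | address (10) |
OPCODE_BITS, MODE_BITS, ADDRESS_BITS = 4, 2, 10


def decode(word: int) -> dict:
    """Split an instruction word into its three fields."""
    address = word & ((1 << ADDRESS_BITS) - 1)
    mode = (word >> ADDRESS_BITS) & ((1 << MODE_BITS) - 1)
    opcode = (word >> (ADDRESS_BITS + MODE_BITS)) & ((1 << OPCODE_BITS) - 1)
    return {"opcode": opcode, "mode": mode, "address": address}


# opcode = 0011, mode = 01, address = 0000101010
print(decode(0b0011_01_0000101010))
```

The control unit performs the same separation in hardware: each group of bits is routed to the part of the CPU that interprets it.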
Computers may have instructions of several different lengths containing varying number of
addresses. The number of address fields in the instruction format of a computer depends on the
internal organization of its registers. Most computers fall into one of three types of CPU
organizations:
1. Single accumulator organization.
2. General register organization.
3. Stack organization.
computers. This is because the operation is performed on the two items that are on top of the
stack. The instruction
    ADD
in a stack computer consists of an operation code only, with no address field. This operation has
the effect of popping the two top numbers from the stack, adding the numbers, and pushing the
sum onto the stack. There is no need to specify operands with an address field since all operands
are implied to be in the stack. Most computers fall into one of the three types of organizations
that have just been described. Some computers combine features from more than one
organization structure.
To illustrate how the number of addresses affects a program, we will evaluate the arithmetic
statement X = (A + B) ∗ (C + D) using zero, one, two, or three address instructions. We will use
the symbols ADD, SUB, MUL, and DIV for the four arithmetic operations; MOV for the
transfer-type operation; and LOAD and STORE for transfers to and from memory and the AC
register. We will assume that the operands are in memory addresses A, B, C, and D, and that the
result must be stored in memory at address X.
ZERO-ADDRESS INSTRUCTIONS
A stack-organized computer does not use an address field for the instructions ADD and MUL.
The PUSH and POP instructions, however, need an address field to specify the operand that
communicates with the stack. The following program shows how X = (A + B) ∗ (C + D) will be
written for a stack-organized computer. (TOS stands for top of stack.)
    PUSH A      TOS ← A
    PUSH B      TOS ← B
    ADD         TOS ← (A + B)
    PUSH C      TOS ← C
    PUSH D      TOS ← D
    ADD         TOS ← (C + D)
    MUL         TOS ← (C + D) ∗ (A + B)
    POP X       M[X] ← TOS
To evaluate arithmetic expressions in a stack computer, it is necessary to convert the expression
into reverse Polish notation. The name “zero-address” is given to this type of computer because
of the absence of an address field in the computational instructions.
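The zero-address program above can be traced with a small simulator. This is a minimal Python sketch, not a model of any real machine; the function name `run_stack_program` and the sample values for A, B, C and D are assumptions made for illustration.

```python
def run_stack_program(program, memory):
    """Execute PUSH/POP/ADD/MUL instructions on a simple stack machine."""
    stack = []
    for instruction in program:
        op, *operand = instruction.split()
        if op == "PUSH":            # TOS <- M[address]
            stack.append(memory[operand[0]])
        elif op == "POP":           # M[address] <- TOS
            memory[operand[0]] = stack.pop()
        elif op == "ADD":           # pop two values, push their sum
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == "MUL":           # pop two values, push their product
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown instruction: {instruction}")
    return memory


program = ["PUSH A", "PUSH B", "ADD", "PUSH C", "PUSH D", "ADD", "MUL", "POP X"]
result = run_stack_program(program, {"A": 2, "B": 3, "C": 4, "D": 5})
print(result["X"])  # (2 + 3) * (4 + 5) = 45
```

The instruction order is exactly the reverse Polish form of the expression, A B + C D + ∗, which is why ADD and MUL never need an address.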
ONE-ADDRESS INSTRUCTIONS
One-address instructions use an implied accumulator (AC) register for all data manipulation. For
multiplication and division there is a need for a second register. However, here we will neglect
the second register and assume that the AC contains the result of all operations. The program to
evaluate X = (A + B) ∗ (C + D) is
    LOAD A      AC ← M[A]
    ADD B       AC ← AC + M[B]
    STORE T     M[T] ← AC
    LOAD C      AC ← M[C]
    ADD D       AC ← AC + M[D]
    MUL T       AC ← AC ∗ M[T]
    STORE X     M[X] ← AC
All operations are done between the AC register and a memory operand. T is the address of a
temporary memory location required for storing the intermediate result.
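The one-address program can be traced the same way, with a single accumulator variable standing in for the AC register. This is a minimal Python sketch; the function name `run_accumulator_program` and the sample operand values are assumptions made for illustration.

```python
def run_accumulator_program(program, memory):
    """Execute LOAD/STORE/ADD/MUL instructions on a one-address machine."""
    ac = 0  # the implied accumulator register
    for instruction in program:
        op, address = instruction.split()
        if op == "LOAD":        # AC <- M[address]
            ac = memory[address]
        elif op == "STORE":     # M[address] <- AC
            memory[address] = ac
        elif op == "ADD":       # AC <- AC + M[address]
            ac += memory[address]
        elif op == "MUL":       # AC <- AC * M[address]
            ac *= memory[address]
        else:
            raise ValueError(f"unknown instruction: {instruction}")
    return memory


program = ["LOAD A", "ADD B", "STORE T", "LOAD C", "ADD D", "MUL T", "STORE X"]
result = run_accumulator_program(program, {"A": 2, "B": 3, "C": 4, "D": 5})
print(result["T"], result["X"])  # T holds A + B = 5, X holds 45
```

Because every instruction names only one operand, the intermediate sum A + B has to be parked in T before the accumulator can be reused for C + D.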
TWO-ADDRESS INSTRUCTIONS
Two-address instructions are the most common in commercial computers. Here again each
address field can specify either a processor register or a memory word. The program to evaluate
X = (A + B) ∗ (C + D) is as follows:
processor register or a memory operand. The program in assembly language that evaluates X =
(A + B) ∗ (C + D) is shown below, together with comments that explain the register transfer
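As a rough illustration of the two-address format, the Python sketch below runs one possible instruction sequence for X = (A + B) ∗ (C + D). The sequence itself, the register names R1 and R2, the function name `run_two_address` and the sample values are all assumptions made for illustration; they are not taken from the listing in these notes. In each instruction the first operand is both a source and the destination.

```python
def run_two_address(program, state):
    """Execute MOV/ADD/MUL with 'destination, source' operands.

    `state` maps names to values; registers and memory words share one
    dictionary because either kind of operand may appear in either field.
    """
    for instruction in program:
        op, operands = instruction.split(maxsplit=1)
        dst, src = [name.strip() for name in operands.split(",")]
        if op == "MOV":         # dst <- src
            state[dst] = state[src]
        elif op == "ADD":       # dst <- dst + src
            state[dst] = state[dst] + state[src]
        elif op == "MUL":       # dst <- dst * src
            state[dst] = state[dst] * state[src]
        else:
            raise ValueError(f"unknown instruction: {instruction}")
    return state


# Hypothetical two-address sequence for X = (A + B) * (C + D)
program = [
    "MOV R1, A",    # R1 <- M[A]
    "ADD R1, B",    # R1 <- R1 + M[B]
    "MOV R2, C",    # R2 <- M[C]
    "ADD R2, D",    # R2 <- R2 + M[D]
    "MUL R1, R2",   # R1 <- R1 * R2
    "MOV X, R1",    # M[X] <- R1
]
print(run_two_address(program, {"A": 2, "B": 3, "C": 4, "D": 5})["X"])  # 45
```

Compared with the one-address version, no temporary memory location is needed, because the second register holds the intermediate sum.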
1 Implied Mode: In this mode the operands are specified implicitly in the definition of the
instruction. For example, the instruction “complement accumulator” is an implied-mode
instruction because the operand in the accumulator register is implied in the definition of the
instruction. In fact, all register reference instructions that use an accumulator are implied-mode
instructions.
2 Immediate Mode: In this mode the operand is specified in the instruction itself. In other
words, an immediate mode instruction has an operand field rather than an address field. The
operand field contains the actual operand to be used in conjunction with the operation specified
in the instruction. Immediate-mode instructions are useful for initializing registers to a constant
value. It was mentioned previously that the address field of an instruction may specify either a
memory word or a processor register. When the address field specifies a processor register, the
instruction is said to be in the register mode.
3 Register Mode: In this mode the operands are in registers that reside within the CPU.
The particular register is selected from a register field in the instruction. A k-bit field can specify
any one of 2^k registers.
4 Register Indirect Mode: In this mode the instruction specifies a register in the CPU
whose contents give the address of the operand in memory. In other words, the selected register
contains the address of the operand rather than the operand itself.
5 Auto increment or Auto decrement Mode: This is similar to the register indirect mode
except that the register is incremented or decremented after (or before) its value is used to access
memory. This can be achieved by using the increment or decrement instruction. However, some
computers incorporate a special mode that automatically increments or decrements the content of
the register after data access.
6 Direct Address Mode: In this mode the effective address is equal to the address part of
the instruction. The operand resides in memory and its address is given directly by the address
field of the instruction. In a branch-type instruction the address field specifies the actual branch
address.
7 Indirect Address Mode: In this mode the address field of the instruction gives the
address where the effective address is stored in memory. Control fetches the instruction from
memory and uses its address part to access memory again to read the effective address.
8 Relative Address Mode: In this mode the content of the program counter is added to the
address part of the instruction in order to obtain the effective address. The address part of the
instruction is usually a signed number (in 2’s complement representation) which can be either
positive or negative. When this number is added to the content of the program counter, the result
produces an effective address whose position in memory is relative to the address of the next
instruction. To clarify with an example, assume that the program counter contains the number
825 and the address part of the instruction contains the number 24. The instruction at location
825 is read from memory during the fetch phase and the program counter is then incremented by
one to 826. The effective address computed in relative address mode is 826 + 24 = 850. This is
24 memory locations forward from the address of the next instruction. Relative addressing is
often used with branch-type instructions when the branch address is in the area surrounding the
instruction word itself.
9 Indexed Addressing Mode: In this mode the content of an index register is added to the
address part of the instruction to obtain the effective address. The index register is a special CPU
register that contains an index value. The address field of the instruction defines the beginning
address of a data array in memory. Each operand in the array is stored in memory relative to the
beginning address. Some computers dedicate one CPU register to function solely as an index
register. In computers with many processor registers, any one of the CPU registers can contain
the index number. In such a case the register must be specified explicitly in a register field within
the instruction format.
10 Base Register Addressing Mode: In this mode the content of a base register is added to
the address part of the instruction to obtain the effective address. This is similar to the indexed
addressing mode except that the register is now called a base register instead of an index register.
The difference between the two modes is in the way they are used rather than in the way that
they are computed. An index register is assumed to hold an index number that is relative to the
address part of the instruction. A base register is assumed to hold a base address and the address
field of the instruction gives a displacement relative to this base address.
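The effective-address rules for the memory-referencing modes above can be summarised in a few lines of Python. This is a sketch under stated assumptions: the function name `effective_address`, its parameters, and all sample register and memory values other than the 826 + 24 example are hypothetical.

```python
def effective_address(mode, address_field, pc=0, index=0, base=0, memory=None):
    """Return the effective address for a given addressing mode.

    `pc` is the program counter after the fetch phase, that is, the
    address of the next instruction.
    """
    if mode == "direct":        # address field is the effective address
        return address_field
    if mode == "indirect":      # address field points to the effective address
        return memory[address_field]
    if mode == "relative":      # program counter + signed address field
        return pc + address_field
    if mode == "indexed":       # index register + start address of the array
        return index + address_field
    if mode == "base":          # base register + displacement
        return base + address_field
    raise ValueError(f"unknown mode: {mode}")


# Relative mode example from the text: PC is 826 after the fetch, address part is 24.
print(effective_address("relative", 24, pc=826))                 # 850
print(effective_address("indirect", 500, memory={500: 800}))     # 800
print(effective_address("indexed", 500, index=100))              # 600
```

Indexed and base register modes perform the same addition; as the text notes, they differ only in which of the two values is treated as the fixed starting point.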
Step 7-10
Assembly language instruction set.
subtraction, multiplication and division. (Refer to the previous steps)
AND Instruction
The AND instruction performs a boolean (bitwise) AND operation between each pair of
matching bits in two operands and places the result in the destination operand:
AND destination, source
The following operand combinations are permitted, although immediate operands can be no
larger than 32 bits:
AND reg, reg
AND reg, mem
AND reg, imm
AND mem, reg
AND mem, imm
The operands can be 8, 16, 32, or 64 bits, and they must be the same size. For each matching bit
in the two operands, the following rule applies: If both bits equal 1, the result bit is 1; otherwise,
it is 0.
The AND instruction lets you clear 1 or more bits in an operand without affecting other bits. The
technique is called bit masking, much as you might use masking tape when painting a house to
cover areas (such as windows) that should not be painted. Suppose, for example, that a control
byte is about to be copied from the AL register to a hardware device. Further, we will assume
that the device resets itself when bits 0 and 3 are cleared in the control byte. Assuming that we
want to reset the device without modifying any other bits in AL, we can write the following:
    and AL, 11110110b       ; clear bits 0 and 3, leave others unchanged
For example, suppose AL is initially set to 10101110 binary. After ANDing it with 11110110,
AL equals 10100110:
    mov al, 10101110b
    and al, 11110110b       ; result in AL = 10100110
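The masking arithmetic can be verified outside the assembler. The Python sketch below builds the mask from the bit positions to be cleared and applies it to an 8-bit value; the helper name `clear_bits` is an assumption made for illustration.

```python
def clear_bits(value: int, *positions: int) -> int:
    """Clear the given bit positions in an 8-bit value, as AND with a mask does."""
    mask = 0xFF
    for position in positions:
        mask &= ~(1 << position)    # put a 0 in the mask at each position to clear
    return value & mask


al = 0b10101110
print(format(clear_bits(al, 0, 3), "08b"))  # 10100110
```

Clearing bits 0 and 3 produces the mask 11110110, the same constant used in the assembly example.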
OR Instruction
The OR instruction performs a boolean OR operation between each pair of matching bits in two
operands and places the result in the destination operand:
OR destination,source
The OR instruction uses the same operand combinations as the AND instruction:
OR reg, reg
OR reg, mem
OR reg, imm
OR mem, reg
OR mem, imm
The operands can be 8, 16, 32, or 64 bits, and they must be the same size. For each matching bit
in the two operands, the result bit is 1 when at least one of the input bits is 1; otherwise, it is 0.
The OR instruction is particularly useful when you need to set 1 or more bits in an operand
without affecting any other bits. Suppose, for example, that your computer is attached to a servo
motor, which is activated by setting bit 2 in its control byte. Assuming that the AL register
contains a control byte in which each bit contains some important information, the following
code only sets the bit in position 2:
    or AL, 00000100b        ; set bit 2, leave others unchanged
For example, if AL is initially equal to 11100011 binary and then we OR it with 00000100, the
result equals 11100111:
    mov al, 11100011b
    or al, 00000100b        ; result in AL = 11100111
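The same check can be made for OR. This Python sketch sets chosen bit positions in an 8-bit value; the helper name `set_bits` is an assumption made for illustration.

```python
def set_bits(value: int, *positions: int) -> int:
    """Set the given bit positions in an 8-bit value, as OR with a mask does."""
    mask = 0
    for position in positions:
        mask |= 1 << position       # put a 1 in the mask at each position to set
    return (value | mask) & 0xFF    # keep the result within 8 bits


al = 0b11100011
print(format(set_bits(al, 2), "08b"))  # 11100111
```

Bits that are 0 in the mask pass through unchanged, which is why the other seven bits of the control byte keep their values.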
XOR Instruction
The XOR instruction performs a boolean exclusive-OR operation between each pair of matching
bits in two operands and stores the result in the destination operand:
XOR destination, source
The XOR instruction uses the same operand combinations and sizes as the AND and OR
instructions. For each matching bit in the two operands, the following applies: If both bits are the
same (both 0 or both 1), the result is 0; otherwise, the result is 1.
Bit-Mapped Sets
Some applications manipulate sets of items selected from a limited-sized universal set. Examples
might be employees within a company, or environmental readings from a weather monitoring
station. In such cases, binary bits can indicate set membership. Rather than holding pointers or
references to objects in a container such as a Java HashSet, an application can use a bit vector
(or bit map) to map the bits in a binary number to an array of objects. For example, the following
binary number uses bit positions numbered from 0 on the right to 31
on the left to indicate that array elements 0, 1, 2, and 31 are members of the set named SetX:
SetX = 10000000 00000000 00000000 00000111
(The bytes have been separated to improve readability.) We can easily check for set membership
by ANDing a particular member’s bit position with a 1:
    mov eax, SetX
    and eax, 10000b         ; is element [4] a member of SetX?
If the AND instruction in this example clears the Zero flag, we know that element [4] is a
member of SetX.
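The membership test can be sketched in Python using the SetX value given above. The function name `is_member` is an assumption made for illustration; a nonzero AND result corresponds to the Zero flag being clear.

```python
SET_X = 0b10000000_00000000_00000000_00000111   # elements 0, 1, 2 and 31


def is_member(bit_set: int, element: int) -> bool:
    """Return True when the bit for `element` is 1 in `bit_set`."""
    return (bit_set & (1 << element)) != 0


print([i for i in range(32) if is_member(SET_X, i)])    # [0, 1, 2, 31]
print(is_member(SET_X, 4))                              # False
```

For this particular SetX, bit 4 is 0, so the AND in the assembly example would produce zero and set the Zero flag, meaning element [4] is not a member.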
addresses with all three specifying processor registers. The following is a program to evaluate X
= (A + B) ∗ (C + D).
    STORE X, R1     M[X] ← R1
Step 11-15
Different passes in an assembly process
Assembly and assembler
An assembler is a program that converts assembly language into machine code. It takes the
basic commands and operations from assembly code and converts them into binary code that can
be recognized by a specific type of processor. Assemblers are similar to compilers in that they
produce executable code.
Conditional Structures
There are no explicit high-level logic structures in the x86 instruction set, but you can implement
them using a combination of comparisons and jumps. Two steps are involved in executing a
conditional statement: First, an operation such as CMP, AND, or SUB modifies the CPU status
flags. Second, a conditional jump instruction tests the flags and causes a branch to a new address.
Let’s look at a couple of examples.
Example 1 The CMP instruction in the following example compares EAX to zero. The JZ
(jump if zero) instruction jumps to label L1 if the Zero flag was set by the CMP instruction:
    CMP eax, 0
    JZ L1               ; jump if ZF = 1
    .
    .
L1:
Example 2 The AND instruction in the following example performs a bitwise AND on the DL
register, affecting the Zero flag. The JNZ (jump if not Zero) instruction jumps if the Zero flag is
clear:
    AND dl, 10110000b
    JNZ L2              ; jump if ZF = 0
    .
    .
L2:
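The two-step pattern in both examples (an operation sets the Zero flag, then a conditional jump tests it) can be modelled in Python. This is a simplified sketch that tracks only the Zero flag; the function names and the returned strings are assumptions made for illustration.

```python
def zero_flag_after_cmp(left: int, right: int, bits: int = 32) -> bool:
    """CMP subtracts without storing the result; ZF is set when the result is 0."""
    mask = (1 << bits) - 1
    return ((left - right) & mask) == 0


def zero_flag_after_and(dest: int, src: int, bits: int = 8) -> bool:
    """AND sets ZF when the bitwise result is 0."""
    mask = (1 << bits) - 1
    return (dest & src & mask) == 0


def example_1(eax: int) -> str:
    # CMP eax, 0 followed by JZ L1: the jump is taken when ZF = 1
    return "L1" if zero_flag_after_cmp(eax, 0) else "fall through"


def example_2(dl: int) -> str:
    # AND dl, 10110000b followed by JNZ L2: the jump is taken when ZF = 0
    return "L2" if not zero_flag_after_and(dl, 0b10110000) else "fall through"


print(example_1(0), "|", example_1(7))
print(example_2(0b00010000), "|", example_2(0b00001111))
```

In Example 2 the jump is taken whenever DL has at least one of bits 4, 5 or 7 set, since those are the only bits the mask lets through.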
Assembler Directives