Assembly - Fundamentals of ALP

Uploaded by

haileyesusasrat9
Chapter One

Basics of Assembly Language Programs

Machine Organization and Assembly Language Programming
Introduction to assembly language
 What is Assembly Language?

Assembly language is a low-level language.

Assembly language gives you exactly what the processor provides, and nothing more.

The processor has no built-in input/output functions such as cin and cout in C++.
Cont’d
 So how do we communicate with the user?

We use system calls (interrupt calls in DOS).

We must code these calls explicitly in our programs:

mov ah,2h ; DOS function 02h: display the character in DL
mov dl,'a' ; the character to display
int 21h ; <--------- coding the interrupt explicitly
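A minimal sketch of a complete program built around that interrupt call, assuming a 16-bit DOS environment (MASM, TASM, or emu8086) and the small memory model. The procedure name main and the use of function 4Ch to exit are choices made for this illustration, not something the slides prescribe.

```asm
; Sketch only: print one character, then return to DOS.
.model small
.stack 100h
.code
main proc
        mov ah, 02h      ; DOS function 02h: write the character in DL
        mov dl, 'a'      ; character to display
        int 21h          ; call DOS
        mov ax, 4C00h    ; AH = 4Ch (terminate program), AL = 0 (return code)
        int 21h
main endp
end main
```

The function number always goes in AH before int 21h; the remaining registers carry the arguments for that function.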
Assembly language structure/components

1) Label: a symbolic name for a memory address
2) Operation code (opcode): the name of the instruction to be executed
3) Operand: additional information or data that the opcode requires
4) Comment: documentation that explains what the code does, to help with debugging and maintenance
Cont’d

 At one time many assembly language mnemonics were three-letter abbreviations, such as JMP for jump, INC for increment, etc.
 Modern processors have a much larger instruction set and many mnemonics are now longer, for example FPATAN for "floating-point partial arctangent" and BOUND for "check array index against bounds".
Cont’d
 In some assembly languages the same mnemonic such as MOV may refer to
a family of related OPCODES for loading, copying and moving
data, whether these are immediate values, values in registers, or
memory locations pointed to by values in registers.
 Other assemblers may use separate OPCODES such as L for "move
memory to register", ST for "move register to memory", LR for "move
register to register", MVI for "move immediate operand to memory", etc.
Comparison of assembly and high-level languages
 Assembly language is a low-level language; a high-level language (HLL) hides the details of the machine.
 Assembly language works directly with registers and main memory, whereas an HLL works with variables and leaves register use to the compiler.
 AL is machine oriented and HLL is human oriented.
 Assembly languages have close to a one-to-one correspondence between symbolic instructions and executable machine codes.
 Assembly languages also include directives to the assembler, directives
to the linker, directives for organizing data space, and macros.
Applications of Assembly language
 Hard-coded assembly language is typically used in a system's boot
ROM (BIOS on IBM-compatible PC systems).
 It is used when a stand-alone binary executable of compact size is required, i.e. one that must execute without recourse to the run-time components or libraries associated with a high-level language; this is perhaps the most common situation.
 Assembly language is also valuable in reverse engineering, since many
programs are distributed only in machine code form, and machine code is
usually easy to translate into assembly language and carefully examine in
this form, but very difficult to translate into a higher-level language.
 Time-critical applications that must respond within a specified time period (real-time applications):
a) Aircraft navigation systems
b) Process control systems
c) Robot control software
d) Communication software
e) Target acquisition (missile tracking) software
Cont’d
 System software often requires direct control over the system hardware, so it needs assembly code to communicate with the hardware. Such system software includes:
a) Operating systems (the low-level part of the operating system)
b) Assemblers and compilers
c) Linkers and loaders
d) Device drivers and network interfaces
What is wrong with Assembly Language?
 Here are the reasons people give for not using assembly:
Assembly is hard to learn.
Assembly language programming is time consuming and not portable.
 Improved compiler technology has eliminated the need for assembly
language.
 Today, machines are so fast and have so much memory that we no longer need to use assembly.
 If you need more speed, you should use a better algorithm rather than
switch to assembly language.
 Assembly Language is
Hard to learn
Hard to read and understand
Hard to debug
Hard to maintain
Hard to write
Time consuming and not portable
What is right with Assembly language?
 Assembly language has several benefits:
Speed. Assembly language programs are generally the fastest programs around.
Space. Assembly language programs are often the smallest.
Capability. You can do things in assembly which are difficult or impossible in HLLs.
Knowledge. Your knowledge of assembly language will help you write better programs, even when using HLLs.
Hierarchy of Languages
Compilers and Assemblers
Tools of Assembly language programming
 Assembly language programming tools are available as free software
packages under the GNU licensing agreement and as commercial
software available for purchase.
 To program in assembly, you will need some software, namely an
assembler and an editor.
Assemblers
1) MASM – Originally by Microsoft, it's now included in the MASM32v8 package, which includes other tools as well.
2) TASM – Another popular assembler. Made by Borland but is still a commercial product, so you cannot get it for free.
3) NASM – A free, open source assembler, which is also available for other platforms.
4) Emulator software [EMU8086]
Editor:-
 Notepad / Notepad++
 Textpad, VS Code, jEdit
Basic Elements of Assembly Language
Integer Constants
 An integer constant (or integer literal) is made up of an optional leading sign, one or more digits, and an optional suffix character (called a radix) indicating the number’s base: [{+ | −}] digits [radix]
 Radix may be one of the following (uppercase or lowercase):
h hexadecimal
q/o octal
d decimal
b binary
r encoded real
t decimal (alternate)
y binary (alternate)
 If no radix is given, the integer constant is assumed to be decimal. Here are some examples using different radixes:
Cont’d

Example:-

 A hexadecimal constant beginning with a letter must have a leading zero to prevent the assembler from interpreting it as an identifier.
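As a small sketch of the radix rules, the data definitions below store the value 26 in several bases, plus one hexadecimal constant that needs a leading zero. The suffixes are the MASM ones; other assemblers may accept a slightly different set, and the variable names are invented for this illustration.

```asm
; Sketch only: each line stores a one-byte constant.
.model small
.data
decPlain   DB 26          ; no radix suffix: decimal
decSuffix  DB 26d         ; explicit decimal suffix
binVal     DB 00011010b   ; binary, also 26
octVal     DB 32q         ; octal, also 26 (32o means the same)
hexDigit   DB 1Ah         ; hexadecimal, also 26
hexLetter  DB 0A3h        ; leading 0 because the constant starts with a letter
end
```

Without the leading zero, A3h would be read as an identifier named A3h rather than as a number.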
 Integer Expressions:- An integer expression is a mathematical expression involving integer values and arithmetic operators.
 The expression must evaluate to an integer that can be stored in 32 bits (00000000h through FFFFFFFFh).
 The arithmetic operators are listed in the following table according to their precedence order, from highest (1) to lowest (4).
( ) parentheses: 1
+, − unary plus, minus: 2
*, /, MOD multiply, divide, modulus: 3
+, − add, subtract: 4
Cont’d

 Precedence refers to the implied order of operations when an expression contains two or more operators.
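The sketch below shows how precedence decides the value of a few constant expressions, assuming MASM's rules: parentheses first, then unary plus and minus, then *, / and MOD, then binary + and −. The assembler evaluates each expression at assembly time and stores only the result; the names expr1 to expr4 are invented for this illustration.

```asm
; Sketch only: constant expressions evaluated by the assembler.
.model small
.data
expr1 DW 4 + 5 * 2       ; 14: multiplication before addition
expr2 DW 12 - 1 MOD 5    ; 11: MOD before subtraction
expr3 DW -5 + 2          ; -3 (stored as FFFDh): unary minus first
expr4 DW (4 + 2) * 6     ; 36: parentheses first
end
```

Adding parentheses, as in expr4, is the simplest way to make the intended order explicit.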
Cont’d
 Real number constants:- are represented as decimal reals or encoded (hexadecimal) reals.
 A decimal real contains an optional sign followed by an integer, a decimal point, an optional integer that expresses a fraction, and an optional exponent:
 [sign] integer.[integer] [exponent]
sign {+,-}
exponent E[{+,-}]integer
 Examples:
+3.0
-44.2E+05
26.E5
 At least one digit and a decimal point are required.
Character Constants:
 A single character enclosed in single or double quotes.
 Example: 'A', "d"
String Constants:
 A sequence of characters (including spaces) enclosed in single or double quotes.
 Examples:
 'ABC'
 'X'
 "Good night, Gracie"
 '4096'
 Embedded quotes:
 "This isn't a test"
 'Say "Good night," Helen'
Reserved Words
 Reserved words have special meaning in the assembler and can only be used in their correct context.
 There are different types of reserved words:
Instruction mnemonics, such as MOV, ADD, MUL, INC, JMP, etc.
Register names, such as AX, BX, CX, SI, DI
Directives, which tell the assembler how to assemble programs, such as .code, .data, .stack
Attributes, which provide size and usage information for variables and operands. Examples are BYTE and WORD
Operators, used in constant expressions, such as + / - *
Identifiers
 An identifier is a programmer-chosen name. It might identify a variable, a constant, a procedure, or a code label.
 Keep the following in mind when creating identifiers:
 They may contain between 1 and 247 characters.
 They are not case sensitive.
 The first character must be a letter (A..Z, a..z), underscore (_), @, ?, or $. Subsequent characters may also be digits.
 An identifier cannot be the same as an assembler reserved word.
 The @ symbol is used extensively by the assembler as a prefix for predefined symbols, so avoid it in your own identifiers. Make identifier names descriptive and easy to understand. Here are some valid identifiers:
var1, Count, $first, _main, MAX, open_file, myFile, xVal, _12345
Directives
 A directive is a command embedded in the source code that is recognized and acted upon by the assembler.
 Directives do not execute at runtime.
 They can define variables, macros, and procedures.
 They can assign names to memory segments and perform many other housekeeping tasks related to the assembler.
 Directives are case insensitive. For example, the assembler recognizes .data, .DATA, and .Data as equivalent.
 The following example helps to show the difference between directives and instructions.
 The DWORD directive tells the assembler to reserve space in the program for a doubleword variable.
 The MOV instruction, on the other hand, executes at runtime, copying the contents of myVar to the EAX register:
myVar DWORD 26 ; DWORD directive
mov eax,myVar ; MOV instruction
Cont’d
Defining Segments:
 One important function of assembler directives is to define program sections, or segments.
 The .DATA directive identifies the area of a program containing variables: .data
 The .CODE directive identifies the area of a program containing executable instructions: .code
 The .STACK directive identifies the area of a program holding the runtime stack, setting its size: .stack 100h
 Instructions:- An instruction is a statement that becomes executable when a program is assembled. It is translated by the assembler into machine language bytes, which are loaded and executed by the CPU at runtime. An instruction contains four basic parts:
[label:] mnemonic [operands] [; comment]
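To tie segments, directives, and the four parts of an instruction together, here is a minimal sketch of a complete 16-bit DOS program. It assumes the small memory model and an assembler such as MASM, TASM, or emu8086; the names counter, main, and again are invented for this illustration.

```asm
; Sketch only: each comment says whether the line is a directive or an instruction.
.model small            ; directive: memory model (assumed: small)
.stack 100h             ; directive: reserve 256 bytes for the stack
.data                   ; directive: start of the data segment
counter DW 3            ; directive: reserve one word, initial value 3
.code                   ; directive: start of the code segment
main proc
        mov ax, @data   ; instruction: load the data segment address
        mov ds, ax      ; instruction: DS now points at our variables
again:  dec counter     ; label, mnemonic, operand, and comment on one line
        jnz again       ; instruction: repeat until counter reaches 0
        mov ax, 4C00h   ; instruction: DOS function 4Ch, return code 0
        int 21h         ; instruction: return to DOS
main endp
end main                ; directive: end of source, entry point is main
```

Only the lines marked as instructions produce machine code that runs; the directives shape how the assembler lays out the program.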
Cont’d
 Label:- A label is an identifier that acts as a place marker for instructions and data.
 A label placed just before an instruction implies the instruction’s address. Similarly, a label placed just before a variable implies the variable’s address.
 Data Labels: A data label identifies the location of a variable, providing a convenient way to reference the variable in code. Example:
Count DWORD 100
array DWORD 1024, 2048, 4096, 8192 ; it is possible to define multiple data items following a label
 Code Labels: A label in the code area of a program (where instructions are located) must end with a colon (:).
 Code labels are used as targets of jumping and looping instructions.
Cont’d
 For example, the following JMP (jump) instruction transfers control to the location marked by the label named target, creating a loop:
target:
mov ax,bx
...
jmp target
 A code label can share the same line with an instruction, or it can be on a line by itself:
L1: mov ax,bx
L2:
Cont’d
Operands: Assembly language instructions can have between zero and three operands, each of which can be a register, a memory operand, or a constant expression.
 The STC instruction, for example, has no operands: stc ; set Carry flag
 The INC instruction has one operand: inc eax ; add 1 to EAX
 The MOV instruction has two operands: mov count,ebx ; move EBX to count
 The IMUL instruction has three operands: imul eax,ebx,5 ; in this case, EBX is multiplied by 5, and the product is stored in the EAX register.
Comments: descriptions written for the human reader, in two forms:
1. Single-line comments, beginning with a semicolon character (;).
2. Block comments, beginning with the COMMENT directive and a user-specified symbol.
COMMENT ! This is comment line ! Or
COMMENT & comment line &
Assembling, Linking, and Running Programs
 A source program written in assembly language cannot be executed directly
on its target computer. It must be translated, or assembled into
executable code.
 The assembler produces a file containing machine language called an object
file. This file isn’t quite ready to execute. It must be passed to another
program called a linker, which in turn produces an executable file.
Defining Data Types
 A data type describes a set of values that can be assigned to variables and expressions of the given type.
 The essential characteristic of each type is its size in bits: 8, 16, 32, 48, 64
 A variable declared as DWORD, for example, logically holds an unsigned 32-bit integer. In fact, it could hold a signed 32-bit integer, a 32-bit single precision real, or a 32-bit pointer. The assembler is not case sensitive, so a directive such as DWORD can be written as dword, Dword, dWord, and so on.
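Instruction Mnemonic
 A mnemonic is a short word that identifies an instruction. Assembly language instruction mnemonics such as mov, add, and sub provide hints about the type of operation they perform.
Following are examples of instruction mnemonics:
mov: move (assign) one value to another
add: add two values
sub: subtract one value from another
mul: multiply two values
jmp: jump to a new location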
Data Definition Statement
Syntax: [name] directive initializer [, initializer]...
 Example: count DWORD 12345
 Directive: The directive in a data definition statement can be BYTE, WORD, DWORD, SBYTE, SWORD, or one of the older forms DB, DW, DD, DQ.
 Initializer: At least one initializer is required in a data definition, even if it is zero. Additional initializers, if any, are separated by commas. For integer data types, the initializer is an integer constant or expression matching the size of the variable’s type, such as BYTE or WORD. If you prefer to leave the variable uninitialized (holding whatever value happens to be in memory), the ? symbol can be used as the initializer. All initializers, regardless of their format, are converted to binary data by the assembler.
Defining BYTE and SBYTE Data
 The BYTE (define byte) and SBYTE (define signed byte) directives allocate storage for one or more unsigned or signed values. Each initializer must fit into 8 bits of storage. For example:-
value1 BYTE 'A' ; character constant
value2 BYTE 0 ; smallest unsigned byte
value3 BYTE 255 ; largest unsigned byte
value4 SBYTE -128 ; smallest signed byte
value5 SBYTE +127 ; largest signed byte
 A question mark (?) initializer leaves the variable uninitialized, implying it will be assigned a value at runtime:
value6 BYTE ?
Cont’d
 The optional name is a label marking the variable’s offset from the beginning of its enclosing segment. For example, if value1 is located at offset 0000 in the data segment and consumes 1 byte of storage, value2 is automatically located at offset 0001:
value1 BYTE 10h
value2 BYTE 20h
 The DB directive can also define an 8-bit variable, signed or unsigned:
val1 DB 255 ; unsigned byte
val2 DB -128 ; signed byte
Multiple Initializers
 If multiple initializers are used in the same data definition, its label refers only to the offset of the first initializer. In the following example, assume list is located at offset 0000. If so, the value 10 is at offset 0000, 20 is at offset 0001, 30 is at offset 0002, and 40 is at offset 0003:
list BYTE 10,20,30,40
 A definition can continue over several lines; only the first line needs the label:
list BYTE 10,20,30,40
BYTE 50,60,70,80
BYTE 81,82,83,84 ; multiple declaration
 Within a single data definition, its initializers can use different radixes. Character and string constants can be freely mixed.
 In the following example, list1 and list2 have the same contents:
list1 BYTE 10, 32, 41h, 00100010b
list2 BYTE 0Ah, 20h, 'A', 22h
 Strings are defined the same way, here with a terminating null byte (0):
greeting1 BYTE "Good afternoon",0
greeting2 BYTE 'Good night',0
 Each character uses a byte of storage. Strings are an exception to the rule that byte values must be separated by commas.
DUP Operator
 The DUP operator allocates storage for multiple data items, using a constant expression as a counter. It is particularly useful when allocating space for a string or array, and can be used with initialized or uninitialized data:
BYTE 20 DUP(0) ; 20 bytes, all equal to zero
BYTE 20 DUP(?) ; 20 bytes, uninitialized
BYTE 4 DUP("STACK") ; 20 bytes: "STACKSTACKSTACKSTACK"
 The .data? directive declares uninitialized data, which keeps a large uninitialized block from enlarging the assembled program:
.data
smallArray DWORD 10 DUP(0) ; 40 bytes
.data?
bigArray DWORD 5000 DUP(?) ; 20,000 bytes, not initialized
 Declaring both arrays in .data instead produces a program 20,000 bytes larger:
.data
smallArray DWORD 10 DUP(0) ; 40 bytes
bigArray DWORD 5000 DUP(?) ; 20,000 bytes
Data with instruction
 The assembler lets you switch back and forth between code and data:
.code
mov eax,ebx
.data
temp DWORD ?
.code
mov temp,eax
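The following sketch combines a string definition with the DUP operator and prints both through DOS function 09h, which expects a string terminated by '$'. It assumes a 16-bit DOS environment (MASM, TASM, or emu8086) and the small memory model; the names greeting, stars, and main are invented for this illustration.

```asm
; Sketch only: define a string and a DUP-filled buffer, then print them.
.model small
.stack 100h
.data
greeting DB "Good afternoon", 0Dh, 0Ah, '$'  ; text, carriage return, line feed, terminator
stars    DB 10 DUP('*'), '$'                 ; ten asterisks, then the terminator
.code
main proc
        mov ax, @data
        mov ds, ax                ; DS must point at the data segment
        mov ah, 09h               ; DOS function 09h: print the string at DS:DX
        mov dx, OFFSET greeting
        int 21h
        mov ah, 09h
        mov dx, OFFSET stars
        int 21h
        mov ax, 4C00h             ; terminate with return code 0
        int 21h
main endp
end main
```

Changing the count in 10 DUP('*') changes the length of the second line without retyping the characters.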
Data Transfer Instructions
 The label is optional. An instruction can have zero, one, two, or three operands:
mnemonic ; no operands
mnemonic [destination] ; one operand
mnemonic [destination],[source] ; two operands
mnemonic [destination],[source-1],[source-2] ; three operands
 x86 assembly language uses different types of instruction operands:
Immediate: uses a numeric literal expression
Register: uses a named register in the CPU
Memory: references a memory location
Simple notation for operands
 reg: general-purpose register; sreg: segment register; imm: immediate value; mem: memory operand; reg/mem16: 16-bit register or memory operand
MOV Instruction
 The MOV instruction copies data from a source operand to a destination operand.
 Known as a data transfer instruction, it is used in virtually every program.
 Syntax:- MOV destination, source
 The destination operand’s contents change, but the source operand is unchanged. The right to left movement of data is similar to the assignment statement in C++ or Java: dest = source;
 (In nearly all assembly language instructions, the left-hand operand is the destination and the right-hand operand is the source.)
 MOV is very flexible in its use of operands, as long as the following rules are observed:
Cont’d
 Both operands must be the same size.
 Both operands cannot be memory operands.
 CS, EIP, and IP cannot be destination operands.
 An immediate value cannot be moved to a segment register.
 Here is a list of the general variants of MOV:
 MOV reg,reg
 MOV mem,reg
 MOV reg,mem
 MOV mem,imm
 MOV reg,imm
 MOV reg/mem16,sreg
 MOV sreg,reg/mem16
 Because both operands cannot be memory operands, one variable is copied to another through a register:
.data
var1 WORD ?
var2 WORD ?
.code
mov ax,var1
mov var2,ax
Overlapping Values:-
.data
oneByte BYTE 78h
oneWord WORD 1234h
oneDword DWORD 12345678h
.code
mov eax,0 ; EAX = 00000000h
mov al,oneByte ; EAX = 00000078h
mov ax,oneWord ; EAX = 00001234h
mov eax,oneDword ; EAX = 12345678h
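As a sketch of how the MOV rules shape real code, the program below swaps two word variables. Neither the memory-to-memory copy nor the immediate-to-segment-register load can be done in one instruction, so both go through general-purpose registers. It assumes a 16-bit DOS environment and the small memory model; wordA, wordB, and main are names invented for this illustration.

```asm
; Sketch only: swap two variables while respecting the MOV operand rules.
.model small
.stack 100h
.data
wordA DW 1111h
wordB DW 2222h
.code
main proc
        mov ax, @data     ; an immediate cannot go straight into DS,
        mov ds, ax        ; so it travels through AX
        mov ax, wordA     ; memory-to-memory MOV is not allowed,
        mov bx, wordB     ; so load both words into registers
        mov wordA, bx     ; wordA = 2222h
        mov wordB, ax     ; wordB = 1111h
        mov ax, 4C00h     ; terminate with return code 0
        int 21h
main endp
end main
```

Every MOV here pairs operands of the same size: 16-bit registers with WORD variables.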
Addition & subtraction
 There are different types of arithmetic instructions. Let’s begin with INC (increment), DEC (decrement), ADD (add), SUB (subtract), and NEG (negate).
INC and DEC Instructions
 The INC (increment) and DEC (decrement) instructions, respectively, add 1 and subtract 1 from a single operand.
 The syntax is:
⚫ INC reg/mem
⚫ DEC reg/mem
 Following are some examples:
.data
myWord dw 1000h
.code
inc myWord ; myWord = 1001h
mov bx,myWord
dec bx ; BX = 1000h
ADD Instruction
Syntax: ADD dest, source
.data
var1 dd 10000h
var2 dd 20000h
.code
mov eax,var1 ; EAX = 10000h
add eax,var2 ; EAX = 30000h
Flags:- The Carry, Zero, Sign, Overflow, Auxiliary Carry, and Parity flags are changed according to the value that is placed in the destination operand.
Cont’d
 Syntax: SUB dest, source
 Here is a short code example that subtracts two 16-bit integers:
org 100h
.data
var1 DW 300h
var2 DW 100h
.code
mov ax,var1 ; AX = 300h
sub ax,var2 ; AX = 200h
ret
 Flags: The Carry, Zero, Sign, Overflow, Auxiliary Carry, and Parity flags are changed according to the value that is placed in the destination operand.
NEG Instruction:
 The NEG (negate) instruction reverses the sign of a number by converting the number to its two’s complement. The following operands are permitted:
 NEG reg
 NEG mem
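A short sketch of NEG on a register and on a memory operand, assuming a 16-bit DOS environment and the small memory model. The variable name balance and the procedure name main are invented for this illustration.

```asm
; Sketch only: two's complement negation of a register and a variable.
.model small
.stack 100h
.data
balance DW 100            ; 0064h
.code
main proc
        mov ax, @data
        mov ds, ax
        mov al, 5
        neg al            ; AL = FBh, the 8-bit two's complement of 5 (-5)
        neg al            ; AL = 05h again
        neg balance       ; balance = FF9Ch (-100 as a 16-bit value)
        mov ax, 4C00h     ; terminate with return code 0
        int 21h
main endp
end main
```

Negating twice restores the original value, which is a quick way to check a result in a debugger or emulator.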
Div Instruction
 It has one operand, the divisor S (a register or memory operand); the dividend is implied.
 Syntax: DIV S
 If S is 8 bits, AX is divided by S: the quotient goes to AL and the remainder to AH.
 If S is 16 bits, DX:AX is divided by S: the quotient goes to AX and the remainder to DX.
.code
; 5001d / 25d: Q = 200, R = 1
mov ax,1389h ; AX = 5001d
mov bl,19h ; BL = 25d
div bl ; AL = C8h (200d), AH = 01h
; 1000d / 100d: Q = 10, R = 0
mov dx,0 ; clear the high word of the dividend DX:AX
mov ax,1000
mov bx,100
div bx ; AX = 10, DX = 0
; 55436d / 568d: Q = 97, R = 340
; 10041004h / 1234h
mov ax,1004h
mov dx,1004h
mov bx,1234h
div bx ; AX = E13Dh (quotient), DX = 05A0h (remainder)
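A common use of DIV is turning a binary number into decimal digits: dividing repeatedly by 10 yields the digits as remainders, last digit first. The sketch below prints the value 5001 this way. It assumes a 16-bit DOS environment and the small memory model, and it uses PUSH, POP, CMP, and JNE, which are standard 8086 instructions not covered in these slides; all label names are invented for this illustration.

```asm
; Sketch only: print the unsigned value in AX as decimal text.
.model small
.stack 100h
.code
main proc
        mov ax, 5001        ; value to print
        mov bx, 10          ; divisor
        mov cx, 0           ; number of digits found so far
next_digit:
        mov dx, 0           ; clear the high word: the dividend is DX:AX
        div bx              ; AX = quotient, DX = remainder (0..9)
        push dx             ; save the digit; digits arrive in reverse order
        inc cx
        cmp ax, 0
        jne next_digit      ; keep dividing until the quotient is 0
print_digit:
        pop dx              ; digits come back most significant first
        add dl, '0'         ; convert 0..9 to the ASCII characters '0'..'9'
        mov ah, 02h         ; DOS function 02h: write the character in DL
        int 21h
        loop print_digit    ; CX holds the digit count
        mov ax, 4C00h       ; terminate with return code 0
        int 21h
main endp
end main
```

Clearing DX before each 16-bit division matters: DIV treats DX:AX as one 32-bit dividend, so a leftover value in DX would change the result or cause a divide overflow.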
Mult instruction
 One factor is implied in the accumulator: an 8-bit multiplication uses AL and stores the 16-bit product in AX; a 16-bit multiplication uses AX and stores the 32-bit product in DX:AX (high word in DX, low word in AX).
 Syntax: MUL S
 MUL D8 ; D8*AL ==> AX
 MUL D16 ; D16*AX ==> DX,AX
mov al,14h
mov bl,24h
mul bl ; AH = 02h, AL = D0h
mov ax,1234h
mov bx,45F4h
mul bx ; AX = 5D90h, DX = 04F9h
mov ax,10
mov bx,10
mul bx ; AX = 0064h (100d), DX = 0000h
JMP & LOOP Instructions
 Assembly language programs use conditional instructions to implement high-level statements such as IF statements and loops.
 Unconditional Transfer: Control is transferred to a new location in all cases; a new address is loaded into the instruction pointer, causing execution to continue at the new address. The JMP instruction does this.
 Conditional Transfer: The program branches if a certain condition is true. A wide variety of conditional transfer instructions can be combined to create conditional logic structures.
 The CPU interprets true/false conditions based on the contents of the ECX and Flags registers.
Cont’d
 Syntax: JMP destination
top:
jmp top ; repeat the endless loop
 JMP is unconditional, so a loop like this will continue endlessly unless another way is found to exit the loop.
org 100h
.data
var1 DB "Hello World$"
.code
top:
mov ax,@data
mov ds,ax
mov dx,offset var1 ; DS:DX points to the '$'-terminated string
mov ah,09h ; DOS function 09h: display a string
int 21h
jmp top ; print the string again, endlessly
LOOP Instruction
 ECX is automatically used as a counter and is decremented each time the loop repeats (in 16-bit code, such as the examples below, the counter is CX). Its syntax is LOOP destination
 The execution of the LOOP instruction involves two steps:
 First, it subtracts 1 from ECX. Next, it compares ECX to zero.
 If ECX is not equal to zero, a jump is taken to the label identified by destination. If ECX equals zero, no jump takes place, and control passes to the instruction following the loop.
 In the following example, we add 1 to AX each time the loop repeats. When the loop ends, AX = 5 and CX = 0.
mov ax,0
mov cx,5
L1:
inc ax
loop L1
 The next example adds the values 1 through 5:
.data
sum dw 1
limit dw 5
.code
mov ax,0
mov cx,limit
label1:
add ax,sum
inc sum ; next value to add
loop label1 ; output 0Fh,15d,1111b,17o
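Because LOOP counts CX down to zero, the counter itself can serve as data. The sketch below computes 5! by multiplying a running product by CX on every pass, combining LOOP with the 16-bit form of MUL. It assumes a 16-bit DOS environment and the small memory model; fact5, nextFactor, and main are names invented for this illustration.

```asm
; Sketch only: 5! = 5 * 4 * 3 * 2 * 1 using MUL inside a LOOP.
.model small
.stack 100h
.data
fact5 DW ?
.code
main proc
        mov ax, @data
        mov ds, ax
        mov ax, 1         ; running product
        mov cx, 5         ; loop counter, also the next factor
nextFactor:
        mul cx            ; DX:AX = AX * CX
        loop nextFactor   ; CX = CX - 1, jump while CX is not 0
        mov fact5, ax     ; fact5 = 120 (0078h); DX is 0 for a product this small
        mov ax, 4C00h     ; terminate with return code 0
        int 21h
main endp
end main
```

The loop body runs with CX equal to 5, 4, 3, 2, and 1, and falls through when the decrement reaches 0.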
Nested Loops
 When creating a loop inside another loop, special consideration must be given to the outer loop counter in ECX. You can save it in a variable:
.data
count DWORD ?
.code
mov ecx,100 ; set outer loop count
L1:
mov count,ecx ; save outer loop count
mov ecx,20 ; set inner loop count
L2:
; inner loop body goes here
loop L2 ; repeat the inner loop
mov ecx,count ; restore outer loop count
loop L1 ; repeat the outer loop
 The same pattern in 16-bit code, printing 5 rows of 4 asterisks:
org 100h
.data
count DW ?
.code
mov cx,5 ; set outer loop count
L1:
mov count,cx ; save outer loop count
mov cx,4 ; set inner loop count
L2:
mov ah,02h ; DOS function 02h: display the character in DL
mov dl,2Ah ; '*'
int 21h
loop L2 ; repeat the inner loop
mov cx,count ; restore outer loop count
mov dl,0Dh ; carriage return
int 21h
mov dl,0Ah ; line feed
int 21h
loop L1 ; repeat the outer loop
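An alternative to saving the outer counter in a variable is to keep it on the stack. The sketch below uses this variation, which is not the approach shown in the slides, to print 3 rows of 4 asterisks. It assumes a 16-bit DOS environment and the small memory model, uses the standard PUSH and POP instructions, and invents the names row, star, and main for this illustration.

```asm
; Sketch only: nested loops with the outer count saved on the stack.
.model small
.stack 100h
.code
main proc
        mov cx, 3         ; outer count: 3 rows
row:    push cx           ; save the outer count
        mov cx, 4         ; inner count: 4 characters per row
star:   mov ah, 02h       ; DOS function 02h: write the character in DL
        mov dl, '*'
        int 21h
        loop star         ; repeat the inner loop
        mov ah, 02h
        mov dl, 0Dh       ; carriage return
        int 21h
        mov ah, 02h
        mov dl, 0Ah       ; line feed
        int 21h
        pop cx            ; restore the outer count
        loop row          ; repeat the outer loop
        mov ax, 4C00h     ; terminate with return code 0
        int 21h
main endp
end main
```

Each PUSH must be matched by exactly one POP before the outer LOOP runs, otherwise the counter and the stack both end up wrong.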
Arrays
 Arrays can be seen as chains of variables. A text string is an example of a byte array; each character is represented as an ASCII code value (0..255).
 Here are some array definition examples:
a DB 48h, 65h, 6Ch, 6Ch, 6Fh, 00h
b DB 'Hello$'
 When the assembler sees a string inside quotes it automatically converts it to a set of bytes. This chart shows a part of the memory where these arrays are declared:
Cont’d
 You can access the value of any element in an array using square brackets, for example:
MOV AL, a[3]
 You can also use any of the memory index registers SI, DI, BP, for example:
MOV SI, 3
MOV AL, a[SI]
 If you need to declare a large array you can use the DUP operator.
 The syntax for DUP:
number DUP (value(s))
 number - number of duplicates to make (any constant value).
 value - expression that DUP will duplicate.
 For example:
c DB 5 DUP(9)
is an alternative way of declaring:
c DB 9, 9, 9, 9, 9
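Index addressing and LOOP work well together. The sketch below walks a five-byte array with SI as the index and adds the elements into AL. It assumes a 16-bit DOS environment and the small memory model; values, total, addNext, and main are names invented for this illustration.

```asm
; Sketch only: sum a byte array using SI as an index.
.model small
.stack 100h
.data
values DB 1, 2, 3, 4, 5
total  DB 0
.code
main proc
        mov ax, @data
        mov ds, ax
        mov si, 0            ; index of the current element
        mov al, 0            ; running sum
        mov cx, 5            ; number of elements
addNext:
        add al, values[si]   ; add element number SI
        inc si               ; byte elements are 1 apart
        loop addNext
        mov total, al        ; total = 15 (0Fh)
        mov ax, 4C00h        ; terminate with return code 0
        int 21h
main endp
end main
```

For an array of words the index would advance by 2 per element instead of 1.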
Cont’d
 The LEA (Load Effective Address) instruction and the alternative OFFSET operator are used to access the elements of an array. Both OFFSET and LEA can be used to get the offset address of a variable.
 LEA is more powerful because it also allows you to get the address of an indexed variable. Getting the address of a variable can be very useful in some situations, for example when you need to pass parameters to a procedure.
org 100h
MOV AL,VAR1 ; check value of VAR1 by moving it to AL.
LEA BX,VAR1 ; get address of VAR1 in BX.
MOV BYTE PTR [BX],44h ; modify the contents of VAR1.
MOV AL,VAR1 ; check value of VAR1 by moving it to AL.
RET
VAR1 DB 22h
 The same program using the OFFSET operator. The LEA line above and the MOV BX, OFFSET line below have the same functionality:
org 100h
MOV AL,VAR1
MOV BX,OFFSET VAR1
MOV BYTE PTR [BX],44h
MOV AL,VAR1
RET
VAR1 DB 22h
Cont’d
 In the next example, arrayB contains 3 bytes. As SI (ESI in 32-bit code) is incremented, it points to each byte, in order:
.data
arrayB DB 10h,20h,30h
.code
mov si,OFFSET arrayB
mov al,[si] ; AL = 10h
inc si
mov al,[si] ; AL = 20h
inc si
mov al,[si] ; AL = 30h
ret
 If we use an array of 16-bit integers, we add 2 to SI to address each subsequent array element:
org 100h
.data
arrayW DW 1000h,2000h,3000h
.code
mov si,offset arrayW
mov ax,[si] ; AX = 1000h
add si,2
mov ax,[si] ; AX = 2000h
add si,2
mov ax,[si] ; AX = 3000h
ret
Cont’d
 Example: Adding 32-Bit Integers: The following code example adds three doublewords. A displacement of 4 must be added to ESI as it points to each subsequent array value because doublewords are 4 bytes long:
.data
arrayD DWORD 10000h,20000h,30000h
.code
mov esi,OFFSET arrayD
mov eax,[esi] ; first number
add esi,4
add eax,[esi] ; second number
add esi,4
add eax,[esi] ; third number, EAX = 60000h
 Suppose arrayD is located at offset 10200h. ESI then starts at 10200h and takes the values 10204h and 10208h as it steps through the array.
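The same pointer technique can be written as a loop instead of repeated instructions. The sketch below is a 16-bit variation, so it uses SI and word elements rather than ESI and doublewords, and it obtains the array address with LEA. It assumes a 16-bit DOS environment and the small memory model; arrayW, sumW, addWord, and main are names chosen for this illustration.

```asm
; Sketch only: sum a word array through a pointer in SI.
.model small
.stack 100h
.data
arrayW DW 1000h, 2000h, 3000h
sumW   DW ?
.code
main proc
        mov ax, @data
        mov ds, ax
        lea si, arrayW     ; SI = address of the first element
        mov ax, 0          ; running sum
        mov cx, 3          ; number of elements
addWord:
        add ax, [si]       ; add the word SI points to
        add si, 2          ; words are 2 bytes long
        loop addWord
        mov sumW, ax       ; sumW = 6000h
        mov ax, 4C00h      ; terminate with return code 0
        int 21h
main endp
end main
```

Replacing lea si, arrayW with mov si, OFFSET arrayW gives the same address; LEA becomes necessary only when the address depends on a register.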
System Administration
Thank You…..!!!
