ps1 Crossassembler
A REPORT
ON
CROSSASSEMBLER FOR 8085 MICROPROCESSOR
INSTRUCTION SET
BY
M.P.SUMANTH 2004B2A7623
A. ANIL KUMAR 2004P7PS165
D.DEEPAK 2004P7PS175
K.GIRIDHAR 2004T6PS351
AT
BHARAT DYNAMICS LIMITED
KANCHANBAGH, HYDERABAD
A Practice School-I station of
BIRLA INSTITUTE OF TECHNOLOGY & SCIENCE, PILANI
(July, 2006)
A REPORT
ON
CROSSASSEMBLER FOR 8085 MICROPROCESSOR
INSTRUCTION SET
BY
Prepared in partial fulfillment of the
Practice School – I Course (BITS C221)
AT
BHARAT DYNAMICS LIMITED
KANCHANBAGH, HYDERABAD
A Practice School-I station of
BIRLA INSTITUTE OF TECHNOLOGY & SCIENCE, PILANI
(July, 2006)
BIRLA INSTITUTE OF TECHNOLOGY & SCIENCE
PILANI (RAJASTHAN)
Practice School Division
Station: Bharat Dynamics Limited, Kanchanbagh Centre: Hyderabad
Duration: 53 Days Date of Start: 24th May, 2006
Date of Submission: 12th July, 2006
Title of the Project: CROSSASSEMBLER FOR 8085 MICROPROCESSOR
INSTRUCTION SET
Names and Identification Numbers:
M.P.SUMANTH 2004B2A7623 M.Sc Chemistry with C.S.E
A.ANIL KUMAR 2004P7PS165 B.E (Hons) C.S.E
D.DEEPAK 2004P7PS175 B.E (Hons) C.S.E
K.GIRIDHAR 2004T6PS351 M.Sc (Tech) Information Systems
Name and Designation of the expert:
Mr. M. JAGADISH Manager, Information Technology Division, B.D.L
Name of the PS Faculty: Dr. T.Solomon Raju
Key Words: assembly language, instruction set, assembler, algorithm, coding.
Project Areas: Assembler
Abstract: The report defines the terms related to the assembling process and describes the
instruction set of the Intel 8085 microprocessor. It also presents an algorithm and 'C' language
code for an assembler with certain limitations, along with an example.
Signature of the Students:        Signature of PS Faculty:
M.P. Sumanth                      T. Solomon Raju
A. Anil Kumar                     Date: 12th July, 2006
D. Deepak
K. Giridhar
Date: 12th July, 2006
ACKNOWLEDGEMENTS
We express our special gratitude to Maj. General (Retd.) Rajnish Gossain, Chairman
and Managing Director, BDL, Kanchanbagh, for providing us with the opportunity to
undergo our Practice School-I program at this reputed company.
We are extremely thankful to the Dean, Practice School Division, who was
generous in allotting us this PS station.
We are thankful to Mr. M.S.K. Sharma, General Manager P&A, who gave his valuable
guidance, support and coordination to shape this PS programme. We are thankful to
Mr. N. Satyanarayana, Dy. Manager HRD, who allotted us to various divisions and
introduced us to them.
We are also thankful to Mr. G. Satyanarayana, DGM, ITD, for his valuable guidance in the
PS program. We are thankful to Mr. M. Jagadish, Manager, ITD, who gave his
constant support throughout the PS program and gave us insight into various
concepts of programming.
We express our deep sense of gratitude to our beloved instructor Dr. T. Solomon Raju and
student instructor Mr. Srikanth for guiding us in bringing out the report with beautiful
results.
TABLE OF CONTENTS
Abstract
Acknowledgements
1. Introduction
   1.0 About the Organization
   1.1 Information Technology Division (ITD)
       1.1.0 Introduction
       1.1.1 Servers and Network systems in ITD
       1.1.2 Organization of ITD
2. The Assembler
   2.0 The Assembly Language
   2.1 About the Assembler
       2.1.0 Introduction
       2.1.1 History of Assemblers
       2.1.2 Types of Assemblers
3. The Assembling Process
4. The Intel 8085 Microprocessor
   4.0 Introduction
   4.1 Instruction Set of 8085 Microprocessor
5. Algorithm for the Coding of CrossAssembler
6. Coding for CrossAssembler
   6.0 'C' code for Assembler
   6.1 Lookup Table for the Assembler
   6.2 Keywords Table
   6.3 Jump/Label Instructions Table
   6.4 Sample Assembly Code
References
1. INTRODUCTION
1.0 ABOUT THE ORGANISATION:
BDL was incorporated in July 1970 with the prime objective of establishing a production
base for guided missiles and developing production technology in India. An
ambitious program to produce first-generation anti-tank wire-guided missiles, as a
forerunner to more sophisticated weapons, was taken up under a license agreement
with M/s Aerospatiale, France.
BDL proved its mettle when its first batch of missiles was accepted by
the Indian Army within one year of production. With this, India joined the select group of
countries in the field of missile production.
Meanwhile, mastering missile production technology instilled the
confidence to take up sophisticated second-generation anti-tank missiles, namely
the Milan weapon system. To this effect, a license agreement was signed with Europe's
leading armament manufacturing company, M/s Euro missiles, France, and production
commenced in January 1985. The Milan weapon system
is today the most widely used and envied weapon system in its class in the world. Its
sensitivity and accuracy are unsurpassed, with a hit probability of 97%. Only four countries
in the world, India being one of them, manufacture this weapon today.
BDL has been assigned the production of the Konkurs weapon system (heavy-armour,
long-range, second-generation anti-tank guided missiles) under the license
agreement entered into by the Government of India with the Government of the USSR. The
missiles are semi-automatic command to line-of-sight guided weapons. The project site
for the production of this weapon system is located in Bhanur village, Medak district,
Andhra Pradesh. Production commenced in June 1989.
Further, BDL has been nominated as the prime production agency for the projects
developed by the Defence Research and Development Organisation (DRDO) under the
IGMDP. The IGMDP of the DRDO envisages the development and establishment of a
Development Fabrication Facility (DFF) and a limited-series production facility for the
following missile systems:
Table 1.1 Missiles and their characteristics
PRITHVI   Surface-to-surface missile
TRISHUL   Multi-role tactical core vehicle
AKASH     Medium-range surface-to-air missile
NAG       Third-generation anti-tank missile
BDL has also diversified into rockets and electronics for the Navy. A
number of systems, such as the sonar modification kit, sonic ray plotter, RGB-25 rocket, range
strobe unit, deep suspended target, project4 and NST-58 torpedo, have been produced
and delivered.
Dedicated minicomputers combined with an NC operating system came onto the
market as computer numerical control (CNC) systems. Their memory made
computer numerical control a new, software-oriented technology.
The invention of semiconductor memory devices reduced the size of these systems
compared with the core memories used in the earlier minicomputers. In the initial stage the
computer served as an offline data processing aid in part programming.
1.1. INFORMATION TECHNOLOGY DIVISION (ITD)
1.1.0 Introduction to the ITD:
ITD, with its team of highly qualified and experienced computer professionals, provides
campus-wide, network-based S/W solutions in the areas of production, finance, personnel and
administration. Its core competence is in RDBMS and a diverse range of front-end
tools spanning D2K, VB, VC++, JAVA, ASP, etc.
Some of the prime S/W solutions:
SAMAY         Access control system linked from punch to pay
MANAV         Complete HRM (Human Resource Management) information from recruitment
              to retirement
CHANAKYA      Integrated product costing, budgeting and financial accounting
VIKALPA       ERP solution for manufacturing and materials
DHANVANTRY    Hospital management system
VIGNAN        Computerized library information system
Table 2.1 Names and the purpose of S/W solutions
1.1.1 Servers and Network systems in ITD
- There are 3 servers in ITD: 2 Linux systems with 2 GB RAM and 1 Windows 2003 Server
  system with 1 GB RAM. These are high-configuration machines from the BULL company,
  imported from France. Their main features are:
  * Parallel computing: 10 instructions can be executed at a time and the T-state is very
    short.
  * There are 4 processors on the motherboard.
- 8 GHz speed for the Linux systems and 1.3 GHz for the Windows server, with a 64 GB
  hard disk.
- There are a few other low-configuration systems with 256 MB RAM and a 20 GB hard
  disk.
- There are around 300-400 systems covering the entire BDL Kanchanbagh unit.
- All the buildings are connected to the main server through an optical fibre system.
  These run only one channel at a time. In India a maximum of 8 channels run
  (corresponding to VIBGYOR), but in other countries the channels may exceed 65,000.
- There is a printer in ITD that prints 1200 lines per minute. It is used for bulk
  printing jobs such as pre-printed pay rolls and bulk reports.
- Information goes out from the main server to the different buildings over the optical
  fibre system. From there the computer network accesses the data by converting
  the optical signals and passing them through a series of hubs and switches.
  With switches, the data transfer rate remains the same for all the computers in the network
  (100 Mbps, or megabits per second), unlike hubs, where the speed is shared (10
  Mbps).
- In BDL the switches are cascaded with hubs for effective data transfer and to
  retain performance. 3 hubs are connected to a switch. The database in BDL is
  mainly Oracle, at both the front and back end.
1.1.2 ORGANIZATION OF ITD
DEPUTY GENERAL MANAGER (G.SATYANARAYANA)
MANAGERS
SUBBARAJU      Senior-most team coordinator. Next to the DGM; looks after ITD in his absence
PULLANNA       Hardware
R M GUPTA      Online attendance system, HRMS (Human Resource Management System)
CHEENA         System administration
SREENIVASULU   ISO
JAGADISH       ERM (Enterprise Resource Management) module
Table 2.2 ITD members and the area of their work
2. THE ASSEMBLER
2.0 The Assembly Language
Assembly language, commonly called assembly, asm or symbolic machine code, is a
human-readable notation for the machine language that a specific computer architecture
uses. It is one step away from machine language. Machine language, a pattern of bits
encoding machine operations, is made readable by replacing the raw values with symbols
called mnemonics. Each assembly language statement is translated into one machine
instruction before it is executed on a machine. Assembly language is hardware
dependent; there is a different assembly language for each CPU series.
The distinguishing feature of assembly language is that each mnemonic corresponds to
exactly one machine code.
Instructions in assembly language are generally very simple. Any instruction that
references memory (for data or as a jump target) also has an addressing mode that
determines how the required memory address is calculated.
2.1 About the Assembler
2.1.0. Introduction to assemblers
An assembler is a computer program that takes as input a set of instructions written in
assembly language and produces a corresponding executable computer program in
machine language. The assembler translates assembly instruction mnemonics into opcodes.
In addition, it provides the ability to use symbolic names for memory locations (saving
tedious calculations and manual address updates when a program is slightly
modified) and macro facilities for performing textual substitution.
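The translation of mnemonics into opcodes can be pictured as a table lookup. The following is a minimal sketch in C, not the assembler presented later in this report: the names op_entry, op_table and lookup_opcode are assumptions made for illustration, and the table holds only a handful of one-byte 8085 instructions that take no operand.

```c
#include <stddef.h>
#include <string.h>

/* One row of a hypothetical lookup table: a mnemonic and its 8085 opcode. */
struct op_entry {
    const char *mnemonic;
    unsigned char opcode;
};

/* A few one-byte 8085 instructions that take no operand. */
static const struct op_entry op_table[] = {
    { "NOP",  0x00 },
    { "RLC",  0x07 },
    { "CMA",  0x2F },
    { "HLT",  0x76 },
    { "RET",  0xC9 },
    { "XCHG", 0xEB },
};

/* Returns the opcode for a mnemonic, or -1 if it is not in the table. */
int lookup_opcode(const char *mnemonic)
{
    size_t i;
    for (i = 0; i < sizeof op_table / sizeof op_table[0]; i++) {
        if (strcmp(op_table[i].mnemonic, mnemonic) == 0)
            return op_table[i].opcode;
    }
    return -1;
}
```

For example, lookup_opcode("HLT") yields 0x76, while an unknown mnemonic yields -1 so that the caller can report an error.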
2.1.1 History of Assemblers
- One of the first stored-program computers was the EDSAC (Electronic Delay
  Storage Automatic Calculator), developed at Cambridge University in 1949 by
  Maurice Wilkes and W. Renwick. EDSAC had an assembler, called Initial
  Orders. It was implemented in a read-only memory formed from a set of rotary
  telephone selectors, and it accepted symbolic instructions. Each instruction
  consisted of a one-letter mnemonic, a decimal address, and a third field that was a
  letter. The third field caused one of 12 constants preset by the programmer to be
  added to the address at assembly time.
- One of the first commercially successful computers was the IBM 704. It had
  features such as floating-point hardware and index registers. It was first delivered in
  1956 and its assembler, the UASAP-1, was written in the same year by Roy Nutt of
  United Aircraft Corp. It was a simple binary assembler, did practically nothing but
  one-to-one translation, and left the programmer in complete control over the
  program.
- By the late fifties, IBM had released the 7000 series of computers. These came
  with a macro assembler, SCAT, that had all the features of modern assemblers. The
  GAS (Generalized Assembly System) assembler was another powerful 7090
  assembler.
2.1.2. Types of Assemblers
1. Based on portability, assemblers are divided into two categories.
Self Assembler:
A self-assembler, or resident assembler, is an assembler that runs on the
microcomputer for which it produces object code (machine code). It is specific to the
system on which it is developed and is permanently loaded in memory. Typically this kind
of assembler resides in ROM and is very simple (it supports only a few directives and no
macros).
Cross Assembler:
A cross assembler generates machine language for a different type of computer than the
one on which the assembler runs. In other words, it runs on a computer other than the one
for which it produces object code. It is used to develop programs for single-chip computers
or microprocessors used in specialized applications that are either too small or
otherwise incapable of handling the development software. Many cross-assemblers are
written in a higher-level language to make them portable. They run on a large machine
and produce object code for a small machine.
2. Based on the assembly process, there are two types of assemblers.
One-Pass Assembler:
An assembler that goes through an assembly language program only once is known as a
one-pass assembler. Such an assembler must have some technique to take forward
references into account: assembly language programs use labels that may be defined later
in the program, and each use of such a label is a forward reference. A one-pass assembler
is faster because it goes through the program only once.
Two-Pass Assembler:
An assembler that goes through an assembly language program twice is called a two-pass
assembler. Such an assembler has no difficulty with forward references. During
the first pass it collects all the labels. During the second pass it produces the machine
code for each instruction and assigns addresses to each of them.
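The way two passes remove the forward-reference problem can be sketched in C as follows. This is an illustration under simplifying assumptions, not the code of this project: each source line is reduced to a hypothetical src_line record (an optional label, the instruction size in bytes, and an optional jump target), and the names pass_one, pass_two and find_label are invented for the example.

```c
#include <string.h>

#define MAX_LABELS 16

/* Simplified source line (hypothetical layout for this sketch). */
struct src_line {
    const char *label;   /* label defined on this line, or NULL */
    int size;            /* number of bytes the instruction occupies */
    const char *target;  /* label referenced by a jump, or NULL */
};

struct label_entry {
    const char *name;
    int addr;
};

static struct label_entry labels[MAX_LABELS];
static int label_count;

/* Pass 1: track the running address and record where each label is defined. */
void pass_one(const struct src_line *prog, int n, int origin)
{
    int i, addr = origin;
    label_count = 0;
    for (i = 0; i < n; i++) {
        if (prog[i].label != NULL && label_count < MAX_LABELS) {
            labels[label_count].name = prog[i].label;
            labels[label_count].addr = addr;
            label_count++;
        }
        addr += prog[i].size;
    }
}

/* Returns the address recorded for a label in pass 1, or -1 if undefined. */
int find_label(const char *name)
{
    int i;
    for (i = 0; i < label_count; i++) {
        if (strcmp(labels[i].name, name) == 0)
            return labels[i].addr;
    }
    return -1;
}

/* Pass 2: every label is already known, so forward references resolve directly.
   resolved[i] is -1 when line i has no target or its label is undefined. */
void pass_two(const struct src_line *prog, int n, int *resolved)
{
    int i;
    for (i = 0; i < n; i++)
        resolved[i] = (prog[i].target != NULL) ? find_label(prog[i].target) : -1;
}
```

With a program that starts at 1000H and consists of a 3-byte jump to DONE, a 2-byte instruction, and a 1-byte instruction labelled DONE, pass one records DONE at 1005H, so pass two can fill in the jump even though the label appears after it.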
3. Other categories of assemblers
A Macro-Assembler: One that supports macros.
A Meta-Assembler: One that can handle many different instruction sets.
A Disassembler: This, in a sense, is the opposite of an assembler. It translates machine
code into a source program in assembler language.
A high-level assembler: This is a translator for a language combining the features of a
higher-level language with some features of assembler language. Such a language can
also be considered a machine-dependent higher-level language.
A Micro-Assembler: Used to assemble microinstructions. It is not different in principle
from an assembler. Note that microinstructions have nothing to do with programming
microcomputers.
3. THE ASSEMBLING PROCESS
Two-pass assembly, the simplest to understand, is described in the following flow
charts.
Process in pass one:
Figure 3.1 Flow chart describing the first pass in a two-pass assembler
Process in pass two:
Figure 3.2 Flow chart describing the second pass in the two-pass assembling process
4. THE INTEL 8085 MICROPROCESSOR
4.0 Introduction to 8085 Microprocessor
The Intel 8085 is an 8-bit NMOS microprocessor. It is a 40-pin IC package fabricated on a
single LSI chip. Its clock speed is about 3 MHz, with a clock cycle of 320 ns. It has 80
basic instructions and 246 opcodes. It consists of three main sections.
1. Arithmetic and Logic unit
2. Timing and Control unit
3. Set of registers
4.1 Instruction set of 8085 Microprocessor
Instruction Naming Conventions:
The mnemonics assigned to the instructions are designed to indicate the function of the
instruction. The instructions fall into the following functional categories:
Data Transfer Group:
The data transfer instructions move data between registers or between memory and
registers.
MOV    Move
MVI    Move Immediate
LDA    Load Accumulator Directly from Memory
STA    Store Accumulator Directly in Memory
LHLD   Load H & L Registers Directly from Memory
SHLD   Store H & L Registers Directly in Memory
An 'X' in the name of a data transfer instruction implies that it deals with a register pair
(16 bits):
LXI    Load Register Pair with Immediate Data
LDAX   Load Accumulator from Address in Register Pair
STAX   Store Accumulator in Address in Register Pair
XCHG   Exchange H & L with D & E
XTHL   Exchange Top of Stack with H & L
Arithmetic Group:
The arithmetic instructions add, subtract, increment, or decrement data in registers or
memory.
ADD    Add to Accumulator
ADI    Add Immediate Data to Accumulator
ADC    Add to Accumulator Using Carry Flag
ACI    Add Immediate Data to Accumulator Using Carry
SUB    Subtract from Accumulator
SUI    Subtract Immediate Data from Accumulator
SBB    Subtract from Accumulator Using Borrow (Carry) Flag
SBI    Subtract Immediate from Accumulator Using Borrow (Carry) Flag
INR    Increment Specified Byte by One
DCR    Decrement Specified Byte by One
INX    Increment Register Pair by One
DCX    Decrement Register Pair by One
DAD    Double Register Add; Add Content of Register Pair to H & L Register Pair
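As an illustration of the 16-bit arithmetic performed by DAD, the following C sketch models the instruction's effect on the H & L pair. The function name dad and its parameters are assumptions for this example; the sketch relies on the fact that DAD affects only the carry flag, which is set when the 16-bit sum overflows.

```c
#include <stdint.h>

/* DAD rp: HL = HL + rp (16-bit add). Only the carry flag is affected;
   it is set to 1 when the sum does not fit in 16 bits. */
uint16_t dad(uint16_t hl, uint16_t rp, int *carry)
{
    uint32_t sum = (uint32_t)hl + (uint32_t)rp;
    *carry = (sum > 0xFFFFu) ? 1 : 0;
    return (uint16_t)(sum & 0xFFFFu);
}
```

Adding 0001H to FFFFH in this model gives 0000H with the carry set, whereas 1234H plus 0101H gives 1335H with the carry clear.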
Logical Group:
This group performs logical (Boolean) operations on data in registers and memory and on
condition flags.
The logical AND, OR, and Exclusive OR instructions enable you to set specific bits in
the accumulator ON or OFF.
ANA    Logical AND with Accumulator
ANI    Logical AND with Accumulator Using Immediate Data
ORA    Logical OR with Accumulator
ORI    Logical OR with Accumulator Using Immediate Data
XRA    Exclusive Logical OR with Accumulator
XRI    Exclusive OR Using Immediate Data
The Compare instructions compare an 8-bit value with the contents of the
accumulator:
CMP    Compare
CPI    Compare Using Immediate Data
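The effect of a compare can be sketched in C as below. The structure cmp_flags and the function compare are hypothetical names for this illustration, and only the zero and carry flags are modelled: the operand is subtracted from the accumulator internally, the flags record the result, and the accumulator itself is left unchanged.

```c
#include <stdint.h>

/* Zero and carry flags as left by CMP or CPI (other flags are not modelled). */
struct cmp_flags {
    int zero;
    int carry;
};

/* Compares the accumulator with an 8-bit operand without modifying either. */
struct cmp_flags compare(uint8_t a, uint8_t operand)
{
    struct cmp_flags f;
    f.zero  = (a == operand);  /* Z = 1 when the two values are equal */
    f.carry = (a < operand);   /* CY = 1 when A is smaller (a borrow occurred) */
    return f;
}
```

A program would typically follow the compare with a conditional jump such as JZ or JC to act on these flags.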
The rotate instructions shift the contents of the accumulator one bit position to the left or
right:
RLC    Rotate Accumulator Left
RRC    Rotate Accumulator Right
RAL    Rotate Left through Carry
RAR    Rotate Right through Carry
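The difference between a plain rotate and a rotate through carry is easiest to see in code. The following C sketch models the two left rotates; the function names rlc and ral and the use of an int for the carry flag are assumptions for this example. The right rotates RRC and RAR mirror them in the opposite direction.

```c
#include <stdint.h>

/* RLC: rotate A left; bit 7 moves into both bit 0 and the carry flag. */
uint8_t rlc(uint8_t a, int *carry)
{
    *carry = (a >> 7) & 1;
    return (uint8_t)((a << 1) | *carry);
}

/* RAL: rotate A left through carry; the old carry enters bit 0
   and bit 7 becomes the new carry. */
uint8_t ral(uint8_t a, int *carry)
{
    int old = *carry & 1;
    *carry = (a >> 7) & 1;
    return (uint8_t)((a << 1) | old);
}
```

Starting from 81H with the carry clear, RLC gives 03H while RAL gives 02H; both leave the carry set.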
Complement and carry flag instructions:
CMA Complement Accumulator
CMC Complement Carry Flag
STC Set Carry Flag
Branch Group:
The branching instructions alter normal sequential program flow, either
unconditionally or conditionally. The unconditional branching instructions are as follows:
JMP    Jump
CALL   Call
RET    Return
Conditional branching instructions examine the status of one of four condition flags to
determine whether the specified branch is to be executed. The conditions that may be
specified are as follows:
NZ     Not Zero (Z = 0)
Z      Zero (Z = 1)
NC     No Carry (C = 0)
C      Carry (C = 1)
PO     Parity Odd (P = 0)
PE     Parity Even (P = 1)
P      Plus (S = 0)
M      Minus (S = 1)
The conditional branching instructions are specified as follows:
Table 4.1 Conditional branching instructions
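On the 8085 the conditional mnemonics are formed by appending a condition to J, C or R (for example JNZ, CZ or RM), and the condition occupies a 3-bit field in the opcode. The C sketch below shows how an opcode can be derived from that field; the names cond_names, cond_code and cond_opcode are assumptions for this illustration and are not part of the assembler in this report, which reads its opcodes from a lookup table instead.

```c
#include <string.h>

/* Condition names in the order of their 3-bit code (NZ = 0 ... M = 7). */
static const char *const cond_names[8] = {
    "NZ", "Z", "NC", "C", "PO", "PE", "P", "M"
};

/* Returns the 3-bit condition code, or -1 for an unknown condition name. */
int cond_code(const char *name)
{
    int i;
    for (i = 0; i < 8; i++) {
        if (strcmp(cond_names[i], name) == 0)
            return i;
    }
    return -1;
}

/* kind is 'J' (jump), 'C' (call) or 'R' (return).
   Returns the opcode, or -1 if the kind or the condition is not recognised. */
int cond_opcode(char kind, const char *cond)
{
    int base;
    int ccc = cond_code(cond);
    if (ccc < 0)
        return -1;
    switch (kind) {
    case 'J': base = 0xC2; break;  /* 11 ccc 010 */
    case 'C': base = 0xC4; break;  /* 11 ccc 100 */
    case 'R': base = 0xC0; break;  /* 11 ccc 000 */
    default:  return -1;
    }
    return base | (ccc << 3);      /* place the condition in bits 5..3 */
}
```

For instance, cond_opcode('J', "NZ") gives C2H (JNZ) and cond_opcode('C', "Z") gives CCH (CZ).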
Two other instructions can effect a branch by replacing the contents of the program
counter:
PCHL   Move H & L to Program Counter
RST    Special Restart Instruction Used with Interrupts
Stack, I/O, and Machine Control Instructions:
The following instructions affect the Stack and/or Stack Pointer:
PUSH   Push Two Bytes of Data onto the Stack
POP    Pop Two Bytes of Data off the Stack
XTHL   Exchange Top of Stack with H & L
SPHL   Move Content of H & L to Stack Pointer
The I/O instructions are as follows:
IN     Initiate Input Operation
OUT    Initiate Output Operation
The Machine Control instructions are as follows:
EI     Enable Interrupt System
DI     Disable Interrupt System
HLT    Halt
NOP    No Operation
5. ALGORITHM FOR THE CODING OF CROSSASSEMBLER
The algorithm presented below is for writing a two-pass cross-assembler in a high-level
language (C, C++, Java, etc.).
ALGORITHM
The algorithm is divided into two parts, namely PASS 1 and PASS 2.
PASS 1:
IN THE MAIN FUNCTION
- Start.
- Read a line from the source code into an array named arr.
- Read the first word into an array named arr1 and send it to the search_lab
  function.
IN THE SEARCH_LAB FUNCTION
- The contents of arr1 are compared with '.ORIG'.
  * If a match is found, 'm' is set to '4' and the function returns '0' to
    the main function.
  * If no match is found, the 'keywords' file is opened and its contents are
    compared with arr1. If a match is found, the function returns '1'
    to the main function.
- Before returning '1' to the main function, the 'jump' file is opened and
  its contents are compared with arr1.
  * If a match is found, the address and bytes are updated, 'm' is set to '2' and the
    function returns '0' to the main function.
  * If no match is found, arr1 is compared with the 'IN' and
    'OUT' instructions.
  * If that comparison matches, the address and bytes are updated, 'm' is
    set to '3' and the function returns '0' to the main function.
- If no match is found in the 'keywords' file, the valid_lab function is called
  to check the validity of the label. If the label is valid, valid_lab returns
  '0'; if not, it prints 'invalid label' and returns '1'.
- The search_lab function then checks the value returned by valid_lab.
  * If it is '0', a file named 'label_tab' is opened, the contents of arr1 and the
    address are written to the file, and search_lab returns '0' to the
    main function.
  * If it is '1', search_lab simply returns '0' to the main function.
IN THE MAIN FUNCTION
- The function checks the value returned by search_lab.
  * If it is '0', arr1 is cleared.
  * If not, the whole line is read into the array arr1 and sent to the function
    chg_data_addr.
- The value of 'm' is checked. If it is '4', the main function calls a
  function named hex and the address is updated.
IN THE CHG_DATA_ADDR FUNCTION
- The array arr1 is checked to see whether it contains an address instruction
  or a data instruction.
  * If it contains a data instruction, 'v' is set to '1'.
  * If it contains an address instruction, 'v' is set to '0' and
    'n' is set to '1'.
- The bytes and the address are then updated. The contents of arr1 are
  compared with 'HLT'; if a match is found the function returns '1',
  otherwise it returns '0'.
IN THE HEX FUNCTION
- The hex value is converted into a numeric value and stored in the address.
PASS 1 ends.
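The conversion done by the hex function, and the reverse step of turning an updated address back into text, can be sketched with the standard C library as follows. This is an alternative illustration, not the hand-written conversion used in the listing of section 6: the function names parse_hex_addr and format_addr are assumptions, and the sketch expects the address without a trailing 'H'.

```c
#include <stdio.h>
#include <stdlib.h>

/* Converts a hex address string such as "2000" to its numeric value.
   Returns -1 if the text is empty, is not valid hex, or exceeds FFFFH. */
long parse_hex_addr(const char *text)
{
    char *end;
    long value = strtol(text, &end, 16);
    if (end == text || *end != '\0' || value < 0 || value > 0xFFFF)
        return -1;
    return value;
}

/* Writes the address as four upper-case hex digits; out must hold 5 bytes. */
void format_addr(long addr, char out[5])
{
    snprintf(out, 5, "%04lX", (unsigned long)(addr & 0xFFFF));
}
```

Under these assumptions, the default origin "1000" parses to 4096, and after three bytes of code have been assembled format_addr(4096 + 3, buf) leaves "1003" in buf.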
PASS 2:
IN THE MAIN FUNCTION
- Start.
- Read a line from the source code into an array named arr.
- Read the first word into an array named arr1 and send it to the search_lab
  function.
IN THE SEARCH_LAB FUNCTION
- The process is the same as in pass 1 and returns the same values to the main function,
  except that it does not open the label_tab file or write the labels and their
  addresses into it.
- In pass 2, this function calls look_search to update
  the bytes; look_search prints the address and opcode into the output file.
IN THE MAIN FUNCTION
- The function checks the value returned by search_lab.
  * If it is '0', arr1 is cleared.
  * If not, the whole line is read into the array arr1 and sent to the function
    chg_data_addr.
- The value of 'm' is checked.
  * If it is '2', the main function calls a function named branch.
  * If it is '3', the main function calls a function named port.
  * If it is '4', the main function calls a function named hex.
- The output file is printed on the screen and any temporary files used are deleted.
IN THE CHG_DATA_ADDR FUNCTION
- The process is the same as in pass 1; at the end the function checks
  the value of 'v' and accordingly prints the address or data given in the
  instruction into the output file.
IN THE BRANCH FUNCTION
- The function opens the label_tab file and compares the contents of the file with
  the contents of arr1. If a match is found, the function prints the address of that
  label into the output file.
IN THE PORT FUNCTION
- The function simply prints the value of the port address into the output file.
IN THE HEX FUNCTION
- The process is the same as in pass 1.
PASS 2 ends.
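When pass 2 prints an address operand, the 8085 expects the low-order byte of a 16-bit address before the high-order byte, and the listing in section 6 writes the last two hex digits first for the same reason. A minimal C sketch of that step is given below; the function name format_addr_operand and the "LL,HH" text layout are assumptions chosen to resemble the comma-separated output of the listing.

```c
#include <stdio.h>

/* Writes "LL,HH" for a 16-bit address: low-order byte first, then the
   high-order byte, each as two upper-case hex digits. out must hold 6 bytes. */
void format_addr_operand(unsigned addr, char out[6])
{
    unsigned low  = addr & 0xFFu;
    unsigned high = (addr >> 8) & 0xFFu;
    snprintf(out, 6, "%02X,%02X", low, high);
}
```

For the address 2050H this produces "50,20", which is the order in which the two operand bytes follow the opcode in memory.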
6. CODING FOR CROSSASSEMBLER
6.0 ‘C’ CODE FOR ASSEMBLER
Filename: assembler.c
#include<stdio.h>
#include<stdlib.h> /*for system()*/
#include<string.h>
/*functions declarations*/
int search_lab(char []); /*to deal with appropriate tokens*/
int valid_lab(char []); /*search for labels validity*/
int look_search(char []); /*fetch opcode from lookuptable*/
char *addr_update(int); /*for updating address*/
int chg_data_addr(char *); /*for dealing with data and address involved instructions*/
void branch(char []); /*to deal with branch instructions*/
void port(char []); /*to deal with port instructions*/
void hex(char []); /*for converting hex to decimal numbers*/
/*global variables declarations*/
int PASS=1; /*keeps track of the pass*/
int bytes=0,m=0; /*bytes:keeps track of total number of bytes of code read*/
char start_addr[5]="1000"; /*writable buffer holding the default start address*/
char *p=start_addr; /*it is the default address where the assembled code
starts*/
int ADDR=4096; /*decimal value of 1000H*/
FILE *fp3; /*file pointer for file in which machine code is put*/
int main(int argc,char *argv[])
{ int i=0,k=0,j=0,l=0,d=0,x=2,lines=0; /*lines: is used to keep track of the number of
lines*/
char arr[50]={'\0'},arr1[10]={'\0'},c=' '; /*arr: to read a line,arr1: to read a token*/
FILE *fp,*fp1;
fp3=fopen(argv[2],"w"); /*opening the output file(machine code file)*/
fp=fopen(argv[1],"r+"); /*open the input file*/
remove("label_tab"); /*delete the labels file left over from an earlier run, if any*/
while(feof(fp)==0) /* first pass starts*/
{ j=0;
while(j<=49) /*clearing arr*/
{arr[j]='\0';
j++;}
j=0;
while(j<=9) /*clearing arr1*/
{arr1[j]='\0';
j++;}
k=0; j=0; /*reinitializing the variables*/
i=0; d=0;
fgets(arr,50,fp); /*reading a line from the input file*/
while(i<50&&arr[i]!='\0') /*loop to break the line into tokens; stop at end of line*/
{ if(arr[i]==';') /*terminate to read the line if ';' is encountered*/
break;
else if((arr[i])==' ') /*skip spaces*/
{
if(j==1) /*a string is read*/
{ l=search_lab(arr1); /*send a string/token to search_lab() function*/
if(l==0) /*continue to read the next string*/
{k=0;
j=0;
while(d<9) /*clear arr1*/
{ arr1[d]='\0';
d++;
}
i++;
continue;
}
else
{ j++;
i++;
continue; } } }
else
{
arr1[k]=arr[i]; /*copy character from arr into arr1*/
i++;
k++;
arr1[k]='\0';
if(j==0)
j=1; /*indicator for a string completion*/
}
}
if(l==1) /*deal with mnemonic concatenated with
operands*/
{
x=chg_data_addr(arr1); /*deals with address and data involved instructions*/
p=addr_update(bytes); /*update the address*/
if(x==1)
break; /*break when HALT instruction is encountered*/
}
if(m==4) /*to check if input file starts with an address*/
{
hex(arr1); /*to change address from default 1000 to specified*/
p=addr_update(bytes);
}
m=0;
} /*loop to read the lines stops*/
fclose(fp); /*closing the input file*/
bytes=0; /*reinitializing variables after pass 1*/
i=0;
while(i<4)
{
p[i]='\0';
i++;
}
i=0;
PASS=2; /*changing PASS from 1 to 2*/
fp=fopen(argv[1],"r+"); /*reopening the input file*/
while(feof(fp)==0) /*PASS 2 starts*/
{ j=0;
while(j<=49) /*clearing arr*/
{arr[j]='\0';
j++;}
j=0;
while(j<=9) /*clearing arr1*/
{arr1[j]='\0';
j++;}
k=0; j=0; /*reinitializing the variables*/
i=0; d=0;
fgets(arr,50,fp); /*reading a line from the input file*/
lines++; /*update the number of lines read*/
while(i<50&&arr[i]!='\0') /*loop to break the line into tokens; stop at end of line*/
{ if(arr[i]==';') /*terminate to read the line if ';' is encountered*/
break;
else if((arr[i])==' ') /*skip spaces*/
{
if(j==1) /*a string is read*/
{ l=search_lab(arr1); /*send a string/token to search_lab() function*/
if(l==0) /*continue to read the next string*/
{k=0;
j=0;
while(d<9) /*clear arr1*/
{ arr1[d]='\0';
d++;
}
i++;
continue;
}
else
{ j++;
i++;
continue;
}
}
}
else
{
arr1[k]=arr[i]; /*copy character from arr into arr1*/
i++;
k++;
arr1[k]='\0';
if(j==0) /*indicator for a string completion*/
j=1;
}
}
if(l==1) /*deal with mnemonic concatenated with
operands*/
{ p=addr_update(bytes); /*update the address*/
x=chg_data_addr(arr1); /*deals with address and data involved
instructions*/
if(x==1) /*break when HALT instruction is encountered*/
break;
}
if(m==2) /*to deal with branch instruction*/
{ branch (arr1);
}
if(m==3) /*to deal with port instruction*/
{port(arr1);}
if(m==4) /*to check if output file has start address*/
{
hex(arr1);
}
m=0; /*reinitializing m*/
}
fclose(fp);
fclose(fp3); /*closing input and output files*/
remove("label_tab"); /*delete the temporary labels file*/
/*print the number of lines in the program*/
printf("The total lines in the program are %d\n",lines);
system("cat exe"); /*displaying the output file (assumes it is named exe)*/
}
int search_lab(char s[15])
{ FILE *fp1,*fp,*fp5;
int i=1,k=0,j=0,a;
int t=0;
char d[5]={'\0'},x[5]=".ORIG";
while(k<5)
{ if ((s[k]==x[k])&&((s[i]!='\0')||(s[i]!='\n')))
{
k++;
}
else
break;
}
if(k==5) /*check if string is .ORIG*/
{ m=4;
return 0;
}
k=0;
fp1=fopen("keywords","r+"); /*open the keywords file*/
while(k<8)
{ if((s[k]=='\0')||(s[k]=='\n'))
s[k]='\0';
k++; }
k=0;
do
{ if((fscanf(fp1,"%s",d))==1) /*fetch a word from the file*/
{ }
else
break;
while(k<5)
{ if (s[k]==d[k])
{
k++;
}
else
break;
}
if(k==5) /*check if string is mnemonic*/
{ k=0;
fp=fopen("jump","r+"); /*open the jump mnemonics file*/
while(feof(fp)==0)
{
fscanf(fp,"%s",d);
if(strcmp(s,d)==0) /*check if the string is branch mnemonic*/
{ p=addr_update(bytes);
bytes+=look_search(s); /*fetch opcode for the instruction*/
fclose(fp);
m=2;
return 0;
}
}
if((strcmp(s,"IN")==0)||(strcmp(s,"OUT")==0))
/*check if the mnemonic is port related*/
{ p=addr_update(bytes);
bytes+=look_search(s); /*fetch opcode for the instruction*/
fclose(fp);
m=3;
return 0;
}
fclose(fp); /*close both files before reporting a mnemonic*/
fclose(fp1);
return 1;
}
k=0;
} while(1);
fclose(fp1); /*fp is not open here: every path that opens it returns earlier*/
j=valid_lab(s);
if(PASS==1)
{
if(j!=1)
{
fp1=fopen("label_tab","a+"); /*open labels table to enter labels and address*/
fprintf(fp1,"%s %s",s,p);
a= ftell(fp1);
t++;
while(a<=(t*15)) /*print spaces for rest of 15 cursor positions*/
{ fprintf(fp1," ");
a++;}
fclose(fp1);
return 0;
} }
return 0; /*pass 2, or an invalid label in pass 1*/
}
int look_search(char temp1[10])
{
int i=0,j=0,k=0;
int bytes1=0;
char opcode[3]="AA",temp2[10]="A"; /*temp2: up to 9 characters plus the terminator*/
FILE *fp;
fp=fopen("look_uptab","r+"); /*open lookup_table*/
while(feof(fp)==0)
{
while(k<10)
{
temp2[k]='\0';
k++;
}
if((fscanf(fp,"%s",temp2))==1) /*fetch a string from lookup_table*/
{
}
else
break;
i=0;
while(i<9)
{ if(temp1[i]=='\n')
{temp1[i]='\0';
break;
}
i++;
}
if((strcmp(temp1,temp2))==0) /*comparing sent string with that fetched
from lookup_table*/
{ fscanf(fp,"%s%d",opcode,&bytes1); /*fetch opcode and bytes*/
if(PASS==2)
{ fprintf(fp3,"%s %s",p,opcode);} /*print opcode in output file*/
break;
}
}
fclose(fp); /*close the lookup_table*/
return bytes1;
}
char *addr_update(int bytes1)
{
int a=0,x[4]={0},i,k=1,r=0,q=0;
long int dec=0;
static char s[5]="0000"; /*static writable buffer: the returned pointer stays valid*/
for(i=0;i<4;i++) /*start reading the address*/
{
if((s[i]>='0') && (s[i]<='9')) /*check if the characters are numerals*/
{
a=s[i];
x[i]=a-48; /*numeric value*/
}
else if(s[i]=='A')
x[i]=10;
else if(s[i]=='B')
x[i]=11;
else if(s[i]=='C')
x[i]=12;
else if(s[i]=='D')
x[i]=13;
else if(s[i]=='E')
x[i]=14;
else
x[i]=15;
}
dec=ADDR+bytes1; /*start address plus the bytes assembled so far*/
for(i=0;i<4;i++) /*clear stale digits before the conversion*/
x[i]=0;
i=0;
a=0;
do /*converting decimal address back into hex*/
{ q=(dec/16);
r =(dec%16);
x[i]=r;
x[i+1]=q;
i++;
dec=q;
}
while(q>=16);
for(i=0;i<=3;i++)
{ if((x[i]>=0)&&(x[i]<=9))
{
a=x[i];
s[3-i]=a+48; }
if((x[i]>=10)&&(x[i]<=15))
{ if(x[i]==10)
s[3-i]='A';
else if(x[i]==11)
s[3-i]='B';
else if(x[i]==12)
s[3-i]='C';
else if(x[i]==13)
s[3-i]='D';
else if(x[i]==14)
s[3-i]='E';
else
s[3-i]='F';
}
}
s[i]='\0';
return(s); /*return updated address*/
}
int chg_data_addr(char *s)
{
char d[2]={'\0'},a1[3]={'\0'},a2[2]={'\0'},x[4]="HLT"; /*a1 and x need room for the terminator*/
int i,n=0,v=2;
for(i=0;i<10;i++) /*loop to read the sent token*/
{ if(s[i]==',')
{if((((s[i+1]>='0')&&(s[i+1]<='9'))||((s[i+1]>='A')&&(s[i+1]<='F')))&&(((s[i+2]>='0')
&&(s[i+2]<='9'))||((s[i+2]>='A')&&(s[i+2]<='F')))&&(((s[i+3]>='0')&&(s[i+3]<='9'))
||((s[i+3]>='A')&&(s[i+3]<='F')))) /*deal with the address instructions
of type xxx REG,ADDR*/
{
a1[0]=s[i+3]; a1[1]=s[i+4]; a2[0]=s[i+1]; a2[1]=s[i+2];
s[i+1]='A'; s[i+2]='D'; s[i+3]='D'; s[i+4]='R'; s[i+5]='\0';
v=0;
n=1;
break;
}
else if((((s[i+1]>='A')&&(s[i+1]<='E'))||(s[i+1]=='H')||(s[i+1]=='L')||(s[i+1]=='M'))
&&((s[i+2]=='\0')||(s[i+2]=='H')))
break;
else
if((((s[i+1]>='0')&&(s[i+1]<='9'))||((s[i+1]>='A')&&(s[i+1]<='F')))&&(((s[i+2]>='0')
&&(s[i+2]<='9'))||((s[i+2]>='A')&&(s[i+2]<='F'))))
/*deal with the data instructions
of type xxx REG,DATA*/
{ d[0]=s[i+1]; d[1]=s[i+2];
s[i+1]='D'; s[i+2]='A'; s[i+3]='T'; s[i+4]='A'; s[i+5]='\0';
v=1;
break;
}
} }
if(((s[0]=='L')&&(s[1]=='D')&&(s[2]=='A')&&(s[3]!='X'))||((s[0]=='S')&&(s[1]=='T')
&&(s[2]=='A'))) /*deal with the address instructions
of type xxx ADDR*/
{
a1[0]=s[5]; a1[1]=s[6]; a2[0]=s[3]; a2[1]=s[4];
s[3]='A'; s[4]='D'; s[5]='D'; s[6]='R';
v=0;
n=1;
s[7]='\0';
}
if(((s[0]=='L')&&(s[1]=='H')&&(s[2]=='L')&&(s[3]=='D'))||((s[0]=='S')&&(s[1]=='H')
&&(s[2]=='L')&&(s[3]=='D'))||((s[0]=='C')&&(s[1]=='A')&&(s[2]=='L')&&(s[3]=='L')))
{
a1[0]=s[6]; a1[1]=s[7]; a2[0]=s[4]; a2[1]=s[5]; /*deal with the address instructions
of type xxx ADDR*/
s[4]='A'; s[5]='D'; s[6]='D'; s[7]='R';
v=0;
n=1;
s[8]='\0';
}
if(n!=1)
{
i=3;
if((((s[i]>='0')&&(s[i]<='9'))||((s[i]>='A')&&(s[i]<='F')))&&(((s[i+1]>='0')
&&(s[i+1]<='9'))||((s[i+1]>='A')&&(s[i+1]<='F')))) /*deal with the data instructions
of type xxx DATA*/
{ d[0]=s[i]; d[1]=s[i+1];
s[i]='D'; s[i+1]='A'; s[i+2]='T'; s[i+3]='A'; s[i+4]='\0';
v=1;
} }
bytes+=look_search(s); /*call look_search() function with 's'
string*/
if(PASS==2)
{ if(v==0) /*if it is address instruction then print
address*/
fprintf(fp3,",%s,%c%c\n",a1,a2[0],a2[1]);
else if(v==1) /*if it is data instruction then print
data*/
fprintf(fp3,",%c%c\n",d[0],d[1]);
else
fprintf(fp3,"\n");
}
p=addr_update(bytes);
if((strcmp(s,x))==0) /*check if the mnemonic is HLT*/
return 1;
else
return 0;
}
void port(char x[10])
{
fprintf(fp3,",%s",x); /*printing the value of port instructions*/
}
void branch (char x[10])
{
int n=0,q=0,k=0;
char t[10]={'\0'},e[10]={'\0'};
FILE *fpx;
fpx=fopen("label_tab","r"); /*opening the labels file*/
while(feof(fpx)==0) /*start reading the labels file*/
{
while(k<10)
{
t[k]='\0';
k++;
}
if((fscanf(fpx,"%s",t))==1) /*reading a string from labels file*/
{
}
else
break;
n=0;
while(n<10)
{ if(x[n]=='\n')
{x[n]='\0';
break;
}
n++;
}
if((strcmp(x,t))==0) /*compare the label from file with sent label*/
{
fscanf(fpx,"%s",e); /*read the address of the label from the file*/
fprintf(fp3,",%c%c,%c%c\n",e[2],e[3],e[0],e[1]);
break; /*print address of the label in output file*/
}
}
fclose(fpx); /*close the labels file*/
}
int valid_lab(char a1[10])
{ int i=1;
if((strcmp(a1,".ORIG"))==0)
return 0; /*skip checking for validity if string is .ORIG*/
if(((a1[0]>='!')&&(a1[0]<='/'))) /*check for validity of labels*/
{ printf("2.invalid label\n");
return 1; } /*return '1' if the label is invalid*/
else if((a1[0]>=':')&&(a1[0]<='@'))
{ printf("2.invalid label\n");
return 1;}
else if((a1[0]>='0')&&(a1[0]<='9'))
{ printf("2.invalid label\n");
return 1;}
else if((a1[0]>='[')&&(a1[0]<='^'))
{ printf("2.invalid label\n");
return 1;}
else if((a1[0]>='{')&&(a1[0]<='~'))
{printf("2.invalid label\n");
return 1;}
else{
for(i=1;i<5;i++)
{if(((a1[i]>='!')&&(a1[i]<='/')))
{
printf("3.invalid label\n");
return 1;
}
else if((a1[i]>=':')&&(a1[i]<='@'))
{ printf("3.invalid label\n");
return 1;}
else if((a1[i]>='[')&&(a1[i]<='^'))
{ printf("3.invalid label\n");
return 1;}
else if((a1[i]>='{')&&(a1[i]<='~'))
{ printf("3.invalid label\n");
return 1;}
} }
return 0; /*return '0' if it is a valid label*/
}
void hex(char s[10])
{
int a,x[4]={0},i,k=1 ;
long int dec=0;
for(i=0;i<4;i++) /*loop to read the input hex string*/
{
if((s[i]>='0') && (s[i]<='9')) /*check if the character is a numeral*/
{
a=s[i];
x[i]=a-48; /*put numeric value of numeral into another array*/
}
else if((s[i]=='a') || (s[i]=='A') ) /*check for valid characters in hexadecimal*/
x[i]=10;
else if((s[i]=='b') || (s[i]=='B'))
x[i]=11;
else if((s[i]=='c') || (s[i]=='C'))
x[i]=12;
else if((s[i]=='d') || (s[i]=='D'))
x[i]=13;
else if((s[i]=='e') || (s[i]=='E'))
x[i]=14;
else if((s[i]=='f') || (s[i]=='F'))
{x[i]=15;}
else{
k=0;
break;}
}
if(k==0)
printf("invalid");
else
{dec=x[0]*4096+x[1]*256+x[2]*16+x[3]; /*evaluate decimal value of hex number*/
ADDR=dec;
}
}
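The manual digit-by-digit conversion in hex() can also be expressed with the standard library. The following is a minimal sketch, not the code used in this assembler: the name parse_hex16 is hypothetical, and unlike hex() it returns the value instead of setting ADDR, accepts one to four digits, and tolerates a trailing 'H' suffix as written in the sample program (for example 2000H).

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical helper (not part of the assembler listing).
   Parses 1 to 4 hex digits, optionally followed by a single 'H' or 'h',
   into *value. Returns 1 on success and 0 on invalid input. */
int parse_hex16(const char *s, long *value)
{
    int n = 0;

    /* Count the leading hexadecimal digits. */
    while (isxdigit((unsigned char)s[n]))
        n++;
    if (n < 1 || n > 4)
        return 0;

    /* Allow one trailing 'H' suffix, then require the end of the string. */
    if (s[n] == 'H' || s[n] == 'h') {
        if (s[n + 1] != '\0')
            return 0;
    } else if (s[n] != '\0') {
        return 0;
    }

    /* strtol stops at the first non-hex character, so the suffix is ignored. */
    *value = strtol(s, NULL, 16);
    return 1;
}
```

With this sketch, the string 2000H yields the value 8192, while a string such as 2G00 is rejected.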
6.1 LOOKUP TABLE FOR THE ASSEMBLER:
Filename: look_uptab
Format: mnemonic<data/address/operands> <opcode> <bytes> <instruction cycles>
INRE 1C 1 4 INRH 24 1 4 INRL 2C 1 4 INRM 34 1 10
INXB 03 1 6 INXD 13 1 6 INXSP 33 1 6 JC DA 3 10
JM FA 3 10 JMP C3 3 10 JNC D2 3 10 JNZ C2 3 10
JP F2 3 10 JPE EA 3 10 JPO E2 3 10 JZADDR CA 3 10
JZ CA 3 10 LDAADDR 3A 3 13 LDAXB 0A 1 7 LDAXD 1A 1 7
LHLDADDR 2A 3 16 LXIB 01 3 10 LXID 11 3 10 LXIH 21 3 10
LXISP 31 3 10 MOVA,A 7F 1 4 MOVA,B 78 1 4 MOVA,C 79 1 4
MOVA,D 7A 1 4 MOVA,E 7B 1 4 MOVA,H 7C 1 4 MOVA,L 7D 1 4
MOVA,M 7E 1 7 MOVB,A 47 1 4 MOVB,B 40 1 4 MOVB,C 41 1 4
MOVB,D 42 1 4 MOVB,E 43 1 4 MOVB,H 44 1 4 MOVB,L 45 1 4
MOVB,M 46 1 7 MOVC,A 4F 1 4 MOVC,B 48 1 4 MOVC,C 49 1 4
MOVC,D 4A 1 4 MOVC,E 4B 1 4 MOVC,H 4C 1 4 MOVC,L 4D 1 4
MOVC,M 4E 1 7 MOVD,A 57 1 4 MOVD,B 50 1 4 MOVD,C 51 1 4
MOVD,D 52 1 4 MOVD,E 53 1 4 MOVD,H 54 1 4 MOVD,L 55 1 4
MOVD,M 56 1 7 MOVE,A 5F 1 4 MOVE,B 58 1 4 MOVE,C 59 1 4
MOVE,D 5A 1 4 MOVE,E 5B 1 4 MOVE,H 5C 1 4 MOVE,L 5D 1 4
MOVE,M 5E 1 7 MOVH,A 67 1 4 MOVH,B 60 1 4 MOVH,C 61 1 4
MOVH,D 62 1 4 MOVH,E 63 1 4 MOVH,H 64 1 4 MOVH,L 65 1 4
MOVH,M 66 1 7 MOVL,A 6F 1 4 MOVL,B 68 1 4 MOVL,C 69 1 4
MOVL,D 6A 1 4 MOVL,E 6B 1 4 MOVL,H 6C 1 4 MOVL,L 6D 1 4
MOVL,M 6E 1 7 MOVM,A 77 1 7 MOVM,B 70 1 7 MOVM,C 71 1 7
MOVM,D 72 1 7 MOVM,E 73 1 7 MOVM,H 74 1 7 MOVM,L 75 1 7
MVIA,DATA 3E 2 7 MVIB,DATA 06 2 7 MVIC,DATA 0E 2 7 MVID,DATA 16 2 7
MVIE,DATA 1E 2 7 MVIH,DATA 26 2 7 MVIL,DATA 2E 2 7 MVIM,DATA 36 2 10
NOP 00 1 4 ORAA B7 1 4 ORAB B0 1 4 ORAC B1 1 4
ORAD B2 1 4 ORAE B3 1 4 ORAH B4 1 4 ORAL B5 1 4
ORAM B6 1 7 ORIDATA F6 2 7 OUT D3 2 10 PCHL E9 1 6
POPB C1 1 10 POPD D1 1 10 POPH E1 1 10 POPPSW F1 1 10
PUSHB C5 1 12 PUSHD D5 1 12 PUSHH E5 1 12 PUSHPSW F5 1 12
RAL 17 1 4 RAR 1F 1 4 RC D8 1 12 RET C9 1 10
RIM 20 1 4 RLC 07 1 4 RM F8 1 12 RNC D0 1 12
RNZ C0 1 12 RP F0 1 12 RPE E8 1 12 RPO E0 1 12
RRC 0F 1 4 RST0 C7 1 12 RST1 CF 1 12 RST2 D7 1 12
RST3 DF 1 12 RST4 E7 1 12 RST5 EF 1 12 RST6 F7 1 12
RST7 FF 1 12 RZ C8 1 12 SBBA 9F 1 4 SBBB 98 1 4
SBBC 99 1 4 SBBD 9A 1 4 SBBE 9B 1 4 SBBH 9C 1 4
SBBL 9D 1 4 SBBM 9E 1 7 SBIDATA DE 2 7 SHLDADDR 22 3 16
SIM 30 1 4 SPHL F9 1 6 STAADDR 32 3 13 STAXB 02 1 7
STAXD 12 1 7 STC 37 1 4 SUBA 97 1 4 SUBB 90 1 4
SUBC 91 1 4 SUBD 92 1 4 SUBE 93 1 4 SUBH 94 1 4
SUBL 95 1 4 SUBM 96 1 7 SUIDATA D6 2 7 XCHG EB 1 4
XRAA AF 1 4 XRAB A8 1 4 XRAC A9 1 4 XRAD AA 1 4
XRAE AB 1 4 XRAH AC 1 4 XRAL AD 1 4 XRAM AE 1 7
XRIDATA EE 2 7 XTHL E3 1 16 INXH 23 1 6 LXIH,DATA 21 3 10
LXID,DATA 11 3 10 LXIB,DATA 01 3 10
LXIH,ADDR 21 3 10 LXIB,ADDR 01 3 10
LXID,ADDR 11 3 10 CALLADDR CD 3 18
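Each record of the lookup table consists of four whitespace-separated fields in the format stated above. The following is a minimal sketch of reading such records field by field, offered as an illustration rather than as the look_search() routine of this assembler. The names lut_entry and find_entry are hypothetical, and the sketch scans an in-memory string instead of the file look_uptab.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical record type: one entry of the lookup table. */
struct lut_entry {
    char mnemonic[16];  /* mnemonic with operands, e.g. "MVIA,DATA" */
    char opcode[3];     /* two hex digits, e.g. "3E" */
    int  bytes;         /* instruction length in bytes */
    int  cycles;        /* instruction cycles */
};

/* Scans `text` record by record for a mnemonic equal to `key`.
   Returns 1 and fills *out when found, 0 otherwise. */
int find_entry(const char *text, const char *key, struct lut_entry *out)
{
    struct lut_entry e;
    int used = 0;

    /* %n stores the number of characters consumed so the scan can advance;
       it does not count towards the return value of sscanf. */
    while (sscanf(text, "%15s %2s %d %d%n",
                  e.mnemonic, e.opcode, &e.bytes, &e.cycles, &used) == 4) {
        if (strcmp(e.mnemonic, key) == 0) {
            *out = e;
            return 1;
        }
        text += used;
    }
    return 0;
}
```

For the text "NOP 00 1 4 MVIA,DATA 3E 2 7" and the key "MVIA,DATA", this sketch reports opcode 3E with a length of 2 bytes. Reading all four fields of a record at once keeps the opcode, byte count and cycle count from ever being compared against a mnemonic.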
6.2 KEYWORDS TABLE:
Filename: keywords
The table contains all the mnemonics of 8085 instruction set architecture (ISA).
ACI ADC ADD ADI ANA ANI CALL CC CM CMA CMC CMP CNC CNZ
CP CPE CPI CPO CZ DAA DAD DCR DCX DI EI HLT IN
INR INX JC JM JMP JNC JNZ JP JPE JPO JZ LDA LDAX
LHLD LXI MOV MVI NOP ORA ORI OUT PCHL POP PUSH RAL RAR
RC RET RIM RLC RM RNC RNZ RP RPE RPO RRC RST RZ
SBB SBI SHLD SIM SPHL STA STAX STC SUB SUI XCHG XRA XRI
XTHL
6.3 JUMP/LABEL INSTRUCTIONS TABLE:
Filename: jump
The table contains the mnemonics of labels/jump related instructions.
CC CM CNC CNZ CP CPE CPO CZ JC JM JMP JNC JNZ JP JPE JPO JZ
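Both the keywords table and the jump table are plain lists of whitespace-separated words, so testing whether a token belongs to either table amounts to a linear scan. The following is a minimal sketch under that assumption. The name in_word_list is hypothetical and the list is passed as a string rather than read from the files keywords or jump.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of the assembler listing).
   Returns 1 if `word` appears as a whole token in the
   whitespace-separated `list`, 0 otherwise. */
int in_word_list(const char *list, const char *word)
{
    char tok[16];
    int used = 0;

    /* Read one token at a time; %n records how far the scan advanced. */
    while (sscanf(list, "%15s%n", tok, &used) == 1) {
        if (strcmp(tok, word) == 0)
            return 1;
        list += used;
    }
    return 0;
}
```

Because whole tokens are compared, JM is found in the jump table, while a partial word such as JN is not.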
6.4 SAMPLE ASSEMBLY CODE:
An assembly language program that multiplies a 16-bit multiplicand by an 8-bit multiplier
is given as sample code.
Steps to run the sample code:
1. Type the following assembly language code in a file and save it with the filename
‘sample’ in the same directory as the assembler.
Filename: sample
.ORIG 2000H
LHLD 2501H ; get multiplicand in HL pair
XCHG ; multiplicand in DE pair
LDA 2503H ; multiplier in accumulator
LXI H, 0000 ; initial value of product=00 in HL pair
MVI C, 08 ; count=8 in register C
LOOP DAD H ; shift partial product left by 1 bit
RAL ; rotate multiplier left one bit. Is multiplier’s bit=1?
JNC AHEAD ; No, go to AHEAD
DAD D ; product =product +multiplicand
AHEAD DCR C ; decrement count
JNZ LOOP
SHLD 2504H ; store result
HLT ; stop
2. Type the following commands
UNIX platform
$ cc assembler.c <return>
$ ./a.out sample output <return>
WINDOWS platform
1. Open TC.exe
2. Open ‘assembler.c’ file
3. Compile and run the code
OUTPUT:
A file named ‘output’ is created with the contents shown below, and the same listing is
displayed on the screen.
2000 2A,01,25
2003 EB
2004 3A,03,25
2007 21,00,00
200A 0E,08
200C 29
200D 17
200E D2,12,20
2011 19
2012 0D
2013 C2,0C,20
2016 22,04,25
2019 76
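In the listing above, each line begins with the address of the instruction, the next address equals the current address plus the byte count of the instruction, and a 16-bit operand is written low byte first (2501H appears as 01,25). The following is a minimal sketch of both conventions using the standard %X format in place of the manual digit conversion in addr_update(). The function names format_addr and format_3byte are hypothetical and are not part of the assembler.

```c
#include <stdio.h>

/* Hypothetical helper: writes a 16-bit address as four upper-case
   hex digits into `out` (4 digits plus the terminating '\0'). */
void format_addr(unsigned addr, char out[5])
{
    snprintf(out, 5, "%04X", addr & 0xFFFFu);
}

/* Hypothetical helper: writes a 3-byte instruction as "OP,LO,HI",
   i.e. the opcode followed by the low and then the high operand byte. */
void format_3byte(unsigned opcode, unsigned operand, char out[9])
{
    snprintf(out, 9, "%02X,%02X,%02X",
             opcode & 0xFFu,
             operand & 0xFFu,
             (operand >> 8) & 0xFFu);
}
```

For example, the origin 2000H advanced by the 3 bytes of LHLD gives the address 2003, and LHLD 2501H (opcode 2A) is formatted as 2A,01,25, matching the first two lines of the listing.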