Design of Assembler For Any 15 Instructions of 8086 Using C
Rutul Gandhi
19BEC033
Dept. of Electronics and Communication
Institute of Technology, Nirma University
[email protected]

Hitarth Patel
19BEC039
Dept. of Electronics and Communication
Institute of Technology, Nirma University
[email protected]
I. INTRODUCTION
Instructions are the language used to command a computer architecture, and the instruction set is the vocabulary of that language. Computers can represent information only through the level of an electric signal, which may be high or low. Given this limitation to two alternatives, instructions in the computer are represented using binary digits, i.e. 1 or 0. Such a representation of instructions as a sequence of bits is known as machine language. To make it understandable by humans, there is an equivalent human-readable notation known as assembly language. There are 8 types of instructions supported by the 8086 microprocessor, some of which are data transfer, arithmetic, bit manipulation, branch and loop instructions. Here, we have implemented some of these instructions and generated their equivalent machine instructions.

II. ASSEMBLY LANGUAGE
An assembly language gives instructions to the processor for performing various tasks. It is specific to each processor. Assembly language closely mirrors machine language but is easier to read and write. Since machine language consists of 0s and 1s, it is difficult to write a program using it. Assembly language code is translated to machine code by an assembler. It makes use of an opcode for each instruction. The opcode primarily provides information about the specific instruction and is represented in terms of symbols. This symbolic representation of the opcode is known as a mnemonic, which helps the programmer remember the operation.

Fig. 1. Assembler
The functions of a basic assembler are:
• Translation of mnemonic language code to its corresponding object code.
• Assignment of machine addresses to corresponding symbolic labels.
The processes performed inside the assembler are:
• Scanning (also known as tokenizing).
• Parsing, the process of validating the instructions.
• Creating the symbol table.
• Resolving the forward references.
• Converting into the machine language.
In other words, the design of an assembler involves:
• Converting mnemonic opcodes to their equivalent machine language.
• Converting symbolic operands to their corresponding machine addresses.
• Converting data constants to the corresponding internal machine representation.
• Writing the object program and assembly listing.

B. Types of Assembler
Assemblers are classified on the basis of the number of stages they use to convert assembly-level language to machine-level language:
• One-Pass Assembler: A one-pass assembler accomplishes the conversion of assembly-level code to machine-level code in a single step.
• Multi-Pass/Two-Pass Assembler: A multi-pass or two-pass assembler first processes the assembly-level code and stores the values it finds in the opcode table and symbol table. In the next step, machine-level code is generated using the opcode table and symbol table.
– Pass 1
∗ Define the symbol table and opcode table.
∗ Keep track of the location counter.
∗ Process pseudo-instructions.
∗ Allocate an address to each statement.
∗ Save the addresses allocated to all labels for use in Pass 2.
– Pass 2
∗ Convert each symbolic opcode into its corresponding numeric opcode.
∗ Generate machine code according to the values of symbols and literals.
∗ Process the assembler directives not handled during Pass 1.
∗ Write the object program and assembly listing.
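The division of work between the two passes can be illustrated with a minimal sketch. It is not the authors' implementation: the names `Symbol`, `lookupSymbol`, `passOne`, and `passTwo` are hypothetical, every statement is assumed to occupy exactly one address unit, labels are assumed to start at the beginning of a line and end with a colon, and only the target of a `JMP <label>` statement is resolved.

```c
#include <assert.h>
#include <string.h>

#define MAX_SYMBOLS 32
#define MAX_NAME 16

/* Hypothetical symbol-table entry: a label and the address assigned in Pass 1. */
struct Symbol {
    char name[MAX_NAME];
    int address;
};

static struct Symbol symtab[MAX_SYMBOLS];
static int symbolCount = 0;

/* Return the address recorded for a label, or -1 if it is undefined. */
int lookupSymbol(const char *name)
{
    assert(name != NULL);
    for (int i = 0; i < symbolCount; i++)
        if (strcmp(symtab[i].name, name) == 0)
            return symtab[i].address;
    return -1;
}

/* Pass 1: advance the location counter and record each label's address.
   Assumption: every statement occupies exactly one address unit. */
void passOne(const char *lines[], int n)
{
    int locationCounter = 0;
    symbolCount = 0;
    for (int i = 0; i < n; i++) {
        const char *colon = strchr(lines[i], ':');
        if (colon != NULL && symbolCount < MAX_SYMBOLS) {
            size_t len = (size_t)(colon - lines[i]);
            if (len >= MAX_NAME)
                len = MAX_NAME - 1;
            memcpy(symtab[symbolCount].name, lines[i], len);
            symtab[symbolCount].name[len] = '\0';
            symtab[symbolCount].address = locationCounter;
            symbolCount++;
        }
        locationCounter++;
    }
}

/* Pass 2: with the table complete, resolve the target of each "JMP <label>".
   targets[i] receives the address, or -1 for non-jumps and undefined labels. */
void passTwo(const char *lines[], int n, int targets[])
{
    for (int i = 0; i < n; i++) {
        const char *jmp = strstr(lines[i], "JMP ");
        targets[i] = (jmp != NULL) ? lookupSymbol(jmp + 4) : -1;
    }
}
```

Because `passTwo` runs only after `passOne` has seen every line, a jump to a label defined later in the program (a forward reference) resolves in the same way as a backward one.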
C. Assembler Design
The design of the assembler must take care of generating the symbol table and resolving forward references.
• Symbol Table:
III. IMPLEMENTATION
The objective is to design an assembler for the 8086 microprocessor in the C language. Several features of C are used, including file handling, hashing, data structures, pointers, linked lists, arrays, and string functions. The opcodes are stored in a hash table, which maps each key to its corresponding value. The hash index determines the location at which each value is stored. The table can be implemented as an array of linked lists; here, a structure named ‘Opcode’ is used for hashing with chaining. The symbol table is built as a linked list to save space. A function named ‘conBin’ converts decimal numbers to binary.
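A linked-list symbol table and a decimal-to-binary helper might look like the following sketch. The paper gives only the name ‘conBin’, so its signature here is an assumption, and `SymbolNode`, `addSymbol`, `findSymbol`, and `freeSymbols` are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical node layout; the actual fields are not given in the text. */
struct SymbolNode {
    char name[16];
    int address;
    struct SymbolNode *next;
};

/* Prepend a symbol; returns the new head, or the old head if malloc fails. */
struct SymbolNode *addSymbol(struct SymbolNode *head, const char *name, int address)
{
    struct SymbolNode *node = malloc(sizeof *node);
    if (node == NULL)
        return head;
    strncpy(node->name, name, sizeof(node->name) - 1);
    node->name[sizeof(node->name) - 1] = '\0';
    node->address = address;
    node->next = head;
    return node;
}

/* Linear search; returns the stored address, or -1 when the name is absent. */
int findSymbol(const struct SymbolNode *head, const char *name)
{
    for (; head != NULL; head = head->next)
        if (strcmp(head->name, name) == 0)
            return head->address;
    return -1;
}

/* Release every node of the list. */
void freeSymbols(struct SymbolNode *head)
{
    while (head != NULL) {
        struct SymbolNode *next = head->next;
        free(head);
        head = next;
    }
}

/* Assumed signature for conBin: write the low `width` bits of `value` into
   `out` as '0'/'1' characters, most significant bit first.
   `out` must have room for width + 1 characters. */
void conBin(unsigned value, int width, char *out)
{
    assert(width > 0 && width <= 32);
    for (int i = 0; i < width; i++)
        out[i] = ((value >> (width - 1 - i)) & 1u) ? '1' : '0';
    out[width] = '\0';
}
```

The list allocates one node per symbol, so it uses only as much memory as the program has labels, at the cost of a linear search on every lookup. With this sketch, `conBin(5, 5, bits)` fills `bits` with the string `00101`.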
The hash table stores the opcodes as they are read. It is built using the functions ‘getHashIndex’, ‘insertAtIndex’, and ‘insertIntoHashMap’, and each entry contains the instruction, its code, and its format. The symbol table is generated using the functions ‘getAddressCode’, ‘getRegisterCode’, and ‘getConstantCode’. A specific 5-bit code is assigned to all the registers and address bits. The first pass generates the symbol table and the second pass generates the binary codes. We created two text files, one containing the input instructions and the other containing the input opcodes.
Fig. 2. Input Instructions
Fig. 3. Symbol Table
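The opcode hash table with chaining described in this section could be sketched as below. The function names ‘getHashIndex’, ‘insertAtIndex’, and ‘insertIntoHashMap’ and the structure name ‘Opcode’ come from the text, but their signatures, the field sizes, the bucket count, the character-sum hash, and the lookup helper `findOpcode` are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TABLE_SIZE 17  /* assumed bucket count */

/* One chained entry; field sizes are illustrative assumptions. */
struct Opcode {
    char instruction[8];   /* mnemonic, e.g. "MOV" */
    char code[9];          /* opcode bits stored as a string */
    int format;            /* instruction format number */
    struct Opcode *next;   /* next entry in the same bucket */
};

static struct Opcode *hashTable[TABLE_SIZE];

/* Sum of the character codes modulo the table size (a simple assumed hash). */
int getHashIndex(const char *instruction)
{
    unsigned sum = 0;
    for (; *instruction != '\0'; instruction++)
        sum += (unsigned char)*instruction;
    return (int)(sum % TABLE_SIZE);
}

/* Chain a node at the head of bucket `index`. */
void insertAtIndex(struct Opcode *node, int index)
{
    assert(index >= 0 && index < TABLE_SIZE);
    node->next = hashTable[index];
    hashTable[index] = node;
}

/* Allocate an entry and insert it; returns 0 on success, -1 on failure. */
int insertIntoHashMap(const char *instruction, const char *code, int format)
{
    struct Opcode *node = malloc(sizeof *node);
    if (node == NULL)
        return -1;
    strncpy(node->instruction, instruction, sizeof(node->instruction) - 1);
    node->instruction[sizeof(node->instruction) - 1] = '\0';
    strncpy(node->code, code, sizeof(node->code) - 1);
    node->code[sizeof(node->code) - 1] = '\0';
    node->format = format;
    insertAtIndex(node, getHashIndex(node->instruction));
    return 0;
}

/* Hypothetical lookup helper: walk the bucket chain for a mnemonic. */
const struct Opcode *findOpcode(const char *instruction)
{
    const struct Opcode *node = hashTable[getHashIndex(instruction)];
    while (node != NULL && strcmp(node->instruction, instruction) != 0)
        node = node->next;
    return node;
}
```

Two mnemonics that hash to the same index simply share a bucket, and `findOpcode` distinguishes them by comparing the stored instruction string. A typical call would be `insertIntoHashMap("MOV", "100010", 1)` for each line read from the opcode file, where the code and format values are placeholders.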
V. CONCLUSION
In this paper, we have designed an assembler that detects syntax errors, if any, in the input instructions; in such cases, the generated machine code file is empty. For valid input, the assembler generates the machine code for the given set of instructions, and changing the instructions changes the machine code accordingly. Thus, the program can generate machine code for any set of supported instructions provided by the user.
Fig. 4. Machine Code
ACKNOWLEDGEMENT
We would like to express our gratitude to Prof. Dhaval Shah and Prof. Sachin Gajjar, who gave us the opportunity to carry out research on a topic of our interest and present it in the form of a paper.
We would also like to thank them for their guidance and support whenever it was needed. Finally, we thank the authors of the research papers we referred to, from which we gained relevant information.
REFERENCES
[1] Liu, Yu-Cheng, and Glenn A. Gibson. Microcomputer systems: The
8086/8088 family: Architecture, programming, and design. Prentice-
Hall, Inc., 2000.
[2] Carthy, Joe. An introduction to assembly language programming and
computer architecture. International Thomson Computer Press, 1995.
[3] 8086 Logical Instructions with Assembly Programming Examples (microcontrollerslab.com).
[4] C-Language and Subroutines (8086) (unb.ca).
[5] Instruction Set of 8086 - javatpoint.
[6] http://eceweb.ucsd.edu/~gert/ece30/CN2.pdf
[7] What is an Assembler? Assembly Language, Types, Differences (toppr.com).
[8] C++ — asm declaration - GeeksforGeeks.
[9] Using Inline Assembly in C/C++ - CodeProject.
APPENDIX
#include <stdio.h>
#include <string.h>
#include <stdlib.h>